Showing papers on "Fuzzy clustering" published in 1989


Journal ArticleDOI
TL;DR: The unsupervised fuzzy partition-optimal number of classes algorithm performs well in situations of large variability of cluster shapes, densities, and number of data points in each cluster.
Abstract: This study reports on a method for carrying out fuzzy classification without a priori assumptions on the number of clusters in the data set. Assessment of cluster validity is based on performance measures using hypervolume and density criteria. An algorithm is derived from a combination of the fuzzy K-means algorithm and fuzzy maximum-likelihood estimation. The unsupervised fuzzy partition-optimal number of classes algorithm performs well in situations of large variability of cluster shapes, densities, and number of data points in each cluster. The algorithm was tested on different classes of simulated data, and on a real data set derived from a sleep EEG signal.

1,691 citations
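
The abstract above names the fuzzy K-means algorithm as one ingredient of its method. As a rough illustration only, the following sketch shows the textbook fuzzy K-means (fuzzy c-means) update steps in plain Python. It does not reproduce the paper's hypervolume or density criteria or its maximum-likelihood stage; the function names `fcm_memberships` and `fcm_centers`, the fuzzifier default `m=2.0`, and the sample points are assumptions made for this example.

```python
import math


def fcm_memberships(points, centers, m=2.0):
    """Return rows u[k][i]: membership of point k in cluster i (each row sums to 1)."""
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for p in points:
        dists = [math.dist(p, c) for c in centers]
        if any(d == 0.0 for d in dists):
            # The point coincides with one or more centers: split membership among them.
            hits = [d == 0.0 for d in dists]
            row = [1.0 / sum(hits) if h else 0.0 for h in hits]
        else:
            # Standard update: u_i = 1 / sum_j (d_i / d_j)^(2 / (m - 1)).
            row = [1.0 / sum((d_i / d_j) ** exponent for d_j in dists)
                   for d_i in dists]
        memberships.append(row)
    return memberships


def fcm_centers(points, memberships, m=2.0):
    """Return cluster centers as membership-weighted means (weights u^m)."""
    n_clusters = len(memberships[0])
    dim = len(points[0])
    centers = []
    for i in range(n_clusters):
        weights = [row[i] ** m for row in memberships]
        total = sum(weights)
        centers.append(tuple(
            sum(w * p[d] for w, p in zip(weights, points)) / total
            for d in range(dim)
        ))
    return centers


# Hypothetical data: two well-separated groups of two points each.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
centers = [(0.0, 0.5), (10.0, 10.5)]
u = fcm_memberships(points, centers)
centers = fcm_centers(points, u)
```

Alternating the two functions until the centers stop moving gives the usual iteration. Choosing the number of clusters, which is the actual subject of the paper, would need a validity criterion on top of this loop.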


Journal ArticleDOI
TL;DR: New relational versions of the hard and fuzzy c-means algorithms are presented here for the case when the relational data can reasonably be viewed as some measure of distance.

308 citations


Journal ArticleDOI
28 Aug 1989
TL;DR: Two parallel clustering algorithms are presented and the time complexity of the proposed single-link hierarchical clustering algorithm is reduced from O(MN^2) of the uniprocessor algorithm to O(nN) with MN processors.
Abstract: Clustering techniques play an important role in exploratory pattern analysis, unsupervised learning and image segmentation applications. Many clustering algorithms, both partitional clustering and hierarchical clustering, require intensive computation, even for a modest number of patterns. This paper presents two parallel clustering algorithms. For a clustering problem with N = 2^n patterns and M = 2^m features, the time complexity of the traditional partitional clustering algorithm on a single processor computer is O(MNK), where K is the number of clusters. The proposed algorithm on an SIMD computer with MN processors has a time complexity O(K(n + m)). The time complexity of the proposed single-link hierarchical clustering algorithm is reduced from O(MN^2) of the uniprocessor algorithm to O(nN) with MN processors.

118 citations


Journal ArticleDOI
TL;DR: In this paper, the authors present a linear clustering algorithm based on the calculation of a commonality score which indicates the similarity in the way two machines are used in the shop to manufacture the products or parts.
Abstract: Numerous researchers have suggested methods for clustering machines into manufacturing cells in a Group Technology environment. Many of these methods are numerically complex. This paper presents a new linear clustering algorithm that is fast, simple and quite flexible. The algorithm is based on the calculation of a commonality score which indicates the similarity in the way two machines are used in the shop to manufacture the products or parts.

118 citations


Journal ArticleDOI
TL;DR: In this article, the authors report the development of a number of similarity-based coefficients designed for applying hierarchical cluster analysis to the group technology machine cell formation problem, and also discuss an experimental investigation applying these and other well-known similarity coefficients in conjunction with some well-known clustering algorithms.
Abstract: This paper reports the development of a number of similarity-based coefficients designed for applying hierarchical cluster analysis to the group technology machine cell formation problem. The paper also discusses an experimental investigation applying these and other well-known similarity coefficients in conjunction with some well-known clustering algorithms. The mixture model experimental approach is used for the investigation. A number of problems were generated via simulation and randomly ‘mixed’ to hide the original cellular structure, and the clustering techniques were then applied. Extensions of prior research include the development of new similarity coefficients, their comparative evaluation, and the incorporation of the concept of part ‘weighting’ into the cluster analysis, and hence, cell formation.

118 citations


Journal ArticleDOI
TL;DR: A new and apparently rather useful and natural concept in cluster analysis is studied: given a similarity measure on a set of objects, a subset is regarded as a cluster if any two objects a, b inside this subset have greater similarity than any third object outside has to at least one of a, b.

103 citations
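
The cluster condition in the summary above can be written as a direct check. The sketch below assumes one reading of that sentence: for every pair a, b inside the subset and every object c outside it, s(a, b) must exceed min(s(a, c), s(b, c)). The function name `is_cluster`, the example objects, and the similarity function are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def is_cluster(subset, objects, sim):
    """Check the pairwise condition: s(a, b) > min(s(a, c), s(b, c))
    for all a, b in the subset and all c outside it (assumed reading)."""
    inside = set(subset)
    outside = [x for x in objects if x not in inside]
    for a, b in combinations(inside, 2):
        for c in outside:
            if sim(a, b) <= min(sim(a, c), sim(b, c)):
                return False
    return True


# Hypothetical example: points on a line, similarity = negative distance.
objects = [0, 1, 5]


def similarity(x, y):
    return -abs(x - y)


print(is_cluster({0, 1}, objects, similarity))  # the two close points
print(is_cluster({0, 5}, objects, similarity))  # the two far points, with 1 between them
```

The check is brute force, cubic in the number of objects, which is acceptable for verifying a candidate subset but not for enumerating all clusters.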


Journal ArticleDOI
TL;DR: A method is described, based on clustering, for estimating the parameters of a finite mixture of normal distributions based on fuzzy hypervolume and density criteria, which incorporates unsupervised tracking of initial cluster centers during its first stage.

74 citations


Journal ArticleDOI
TL;DR: A bootstrap-based procedure is developed for obtaining approximate confidence bounds on the number of clusters in the “best” clustering and it is shown that a sample version of the loss function and optimal clustering converge strongly to their theoretical counterparts as the sample size tends to infinity.
Abstract: We consider clustering for the purpose of data reduction. Similar objects are grouped together in clusters so that one can then work with the few cluster descriptors instead of the many data points. The quality of any given clustering is measured by a loss function that takes into account both the parsimony of the clustering and the loss of information due to clustering. An optimal clustering can be obtained by minimizing the theoretical loss function. It is shown that a sample version of the loss function and optimal clustering converge strongly to their theoretical counterparts as the sample size tends to infinity. We then develop a bootstrap-based procedure for obtaining approximate confidence bounds on the number of clusters in the “best” clustering. The effectiveness of this procedure is evaluated in a simulation study. An application is presented.

31 citations


Journal ArticleDOI
TL;DR: A new method for the sequential clustering of data is presented that gives better results than the conventional sequential clustering method.

26 citations


Journal ArticleDOI
01 Jan 1989
TL;DR: Two approaches for the formulation of information through fuzzy associations are presented and a fuzzy association is introduced as a fuzzy relation defined on a set of indices to a database.
Abstract: Two approaches for the formulation of information through fuzzy associations are presented. A fuzzy association is introduced as a fuzzy relation defined on a set of indices to a database. One approach is the extension of fuzzy indices to a database using fuzzy associations. A fuzzy association is a generalization of the concept of fuzzy thesauri. An algorithm for fuzzy information retrieval based on this approach is developed. The other approach represents the retrieval process as a block diagram. Maximum and minimum operations are used instead of the ordinary sum and product operations on the diagram. Studies of advanced indexing, such as the clustering of articles, are represented as feedback on the diagram. Properties of fuzzy information retrieval, such as level fuzzy sets and set operations on responses of the retrieval system, are discussed using the diagram representation.

26 citations
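
The abstract above replaces sum and product with maximum and minimum in the retrieval diagram. A minimal sketch of that substitution is the max-min composition of a query vector with a fuzzy relation, shown here in plain Python. The function name `max_min_compose` and the small association matrix are assumptions for illustration; the paper's actual retrieval algorithm is not reproduced.

```python
def max_min_compose(query, relation):
    """Max-min composition: result[j] = max over i of min(query[i], relation[i][j]).

    This mirrors a vector-matrix product with min in place of
    multiplication and max in place of addition.
    """
    n_cols = len(relation[0])
    return [
        max(min(q, row[j]) for q, row in zip(query, relation))
        for j in range(n_cols)
    ]


# Hypothetical fuzzy association between two index terms (rows) and themselves (columns).
association = [
    [1.0, 0.6],
    [0.3, 1.0],
]
query = [1.0, 0.0]  # full weight on the first term only
expanded = max_min_compose(query, association)
print(expanded)  # [1.0, 0.6]
```

Here the query on the first term is extended to the second term with grade 0.6, which is the kind of index expansion a fuzzy thesaurus provides.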


Journal ArticleDOI
TL;DR: The central purpose of this paper is to put together the basic ideas of two separate theories, the theory of ordinal clustering, as developed by Janowitz et al., and the theory of probabilistic metric spaces, as developed by Schweizer et al., into a single theory, called percentile clustering.

Journal ArticleDOI
TL;DR: A new approach to clustering is developed, which is based on detecting the boundaries of the modes of the underlying probability density function, which calls for a generalization of the concepts of boundary to multidimensional functions.

Journal ArticleDOI
28 Aug 1989
TL;DR: PFCM, a parallel algorithm for fuzzy clustering of large data sets, is presented; as a generalization of FCM, it enables arbitrary numbers of data points, features and clusters to be handled cost-optimally by hypercube SIMD computers of arbitrary cube dimension.
Abstract: This article presents PFCM, a parallel algorithm for fuzzy clustering of large data sets. Being a generalization of FCM, the algorithm enables arbitrary numbers of data points, features and clusters to be handled cost-optimally by hypercube SIMD computers of arbitrary cube dimension, the only limitation being the size of the local memories of the processors. Speedup responds optimally to enlarging the hypercube. PFCM owes its flexibility to the technique employed in its derivation from the sequential fuzzy C-means algorithm FCM: the association of each of the three dimensions of the problem (numbers of data points, features and clusters) with a distinct subset of hypercube dimensions.

Journal ArticleDOI
TL;DR: The sensitivity of several properties of vacancy clusters on the degree of fuzziness is discussed, and fuzzy analysis is suggested as a tool to establish the relation between measures at different scales of the same phenomenon.
Abstract: Representations based on the concepts of the theory of fuzzy sets are suggested to apply to a wide variety of problems in physics. A grade of membership is associated to each element in a set, which is a measure of its distance to a prototype. Fuzzy representations are thus adequate for dealing with situations where the belongingness of an object or a phenomenon to a class is uncertain, or to situations where the classes have no exact definition. An explicit relation is shown between fuzzy representation and dimensionality. Unambiguous definitions of the degree of fuzziness, cluster overlap, and isolated points are given on the basis of an anisotropic grade of membership function. The example treated is that of collision cascades generated by xenon atoms incident on a polycrystalline gold surface with energies ranging from 20 keV to 1 MeV. The cascades are simulated in the binary collision approximation with the Marlowe computer code. They are shown to germinate from simultaneously growing collision clusters. The displacement cascades are found to be only partially space filling. This is emphasized on the basis of their fuzzy geometrical characteristics, without need of any assumption concerning self-similarity. Their possible overlap and lumping are identified on the basis of the grade of membership of each vacated lattice site to each cluster. The final cluster pattern of the vacancy distributions is shown to depend on the degree of fuzziness. The sensitivity of several properties of vacancy clusters to the degree of fuzziness is discussed. This sensitivity is suggested to be a consequence of their granular structure. Consequently, their experimental characterization may be influenced by the resolution of the observation method. Fuzzy analysis is suggested as a tool to establish the relation between measures at different scales of the same phenomenon.

Journal ArticleDOI
TL;DR: A new fuzzy clustering method is put forward, based on finding a fuzzy equivalent matrix R# that is closest to R under a certain ‘distance’.

Journal ArticleDOI
TL;DR: A syntactic pattern recognition approach was applied to the classification of EEG segments belonging to sleep stage I and stage REM, using adaptive segmentation of the EEG signal; primitive recognition was implemented through unsupervised fuzzy clustering.

Proceedings ArticleDOI
01 Nov 1989
TL;DR: This work proposes a model-fitting approach to the cluster validation problem based upon Akaike's Information Criterion (AIC), and demonstrates the efficacy and robustness of the proposed approach through experimental results for both synthetic mixture data and image data.
Abstract: An unsupervised stochastic model-based image segmentation technique requires the model parameters for the various image classes in an observed image to be estimated directly from the image. In this work, a clustering scheme is used for the model parameter estimation. Most of the existing clustering procedures require prior knowledge of the number of classes which is often, as in unsupervised image segmentation, unavailable and has to be estimated. The problem of determining the number of classes directly from observed data is known as the cluster validation problem. For unsupervised image segmentation, the solution of this problem directly affects the quality of the segmentation. In this work, we propose a model-fitting approach to the cluster validation problem based upon Akaike's Information Criterion (AIC). The explicit evaluation of the AIC is achieved through an approximate maximum-likelihood (ML) estimation algorithm. We demonstrate the efficacy and robustness of the proposed approach through experimental results for both synthetic mixture data and image data.
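
The model-fitting idea in the abstract above can be illustrated with the standard definition of Akaike's Information Criterion, AIC = -2 ln L + 2k, where L is the maximized likelihood and k the number of free parameters. The sketch below picks the class count with the lowest AIC. The function names and the log-likelihood values are made up for this example; the paper's approximate maximum-likelihood estimator is not shown.

```python
def aic(log_likelihood, n_params):
    """Akaike's Information Criterion: -2 * ln(L) + 2 * k."""
    return -2.0 * log_likelihood + 2.0 * n_params


def best_class_count(candidates):
    """Return the class count with the smallest AIC.

    candidates maps a class count to (maximized log-likelihood, parameter count).
    """
    return min(candidates, key=lambda k: aic(*candidates[k]))


# Hypothetical fits of a mixture model with 1, 2 and 3 classes.
fits = {
    1: (-520.0, 2),
    2: (-410.0, 5),
    3: (-408.5, 8),
}
print(best_class_count(fits))  # 2
```

In this made-up case the third class improves the likelihood only slightly, so the parameter penalty outweighs the gain and two classes are selected.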

Patent
29 Dec 1989
TL;DR: In this paper, the authors proposed a method to perform image processing with few recognition errors and high accuracy by extracting multiple feature quantities from input image data, performing a first evaluation based on fuzzy clustering, and performing a second evaluation with another method.
Abstract: PURPOSE: To perform image processing with few recognition errors and high accuracy by extracting multiple feature quantities from input image data, performing a first evaluation based on fuzzy clustering, and performing a second evaluation with another method. CONSTITUTION: When the red (R), green (G), and blue (B) images are input from a CCD camera 101, they are digitized and made uniform, noise is removed from them, and the areas for feature quantity extraction are then extracted. A feature quantity extraction part 112 extracts feature quantities for position, brightness, shape, etc., from the extraction area. The feature quantities for position information are matched against a reference pattern from a reference pattern generating part 115 at an arithmetic part 113 with a reverse truth value limiting method. Meanwhile, the feature quantities for brightness and shape information are matched against the reference pattern at an arithmetic part 114 with fuzzy clustering. The two computed results are combined at a coupling part 116 with the Dempster-Shafer rule, and a judgment with high assurance can be output from an output part 117. COPYRIGHT: (C)1991,JPO&Japio


Book ChapterDOI
Jakub Segen
01 Dec 1989
TL;DR: An incremental method of conceptual clustering for continuously valued data is described; it minimizes a cost function of a cluster configuration, defined as the length of a reconstructive representation of the data with the aid of clusters.
Abstract: We describe an incremental method of conceptual clustering for continuously valued data, which minimizes a cost function of a cluster configuration. This function is defined as the length of a reconstructive representation of data with the aid of clusters. The clustering program inserts each new instance to one of the clusters, updates the parameters of this cluster, and possibly divides it into smaller clusters. The program uses a novel prediction mechanism to decide when dividing a cluster might decrease the configuration cost.

Proceedings ArticleDOI
04 Jun 1989
TL;DR: The author proposes parallel algorithms on single-instruction/multiple-data (SIMD) machines for hierarchical clustering, which requires intensive computation and large memory storage, together with a parallel algorithm to compute one type of cluster validity measure, global fit of hierarchy, for quantitative data.
Abstract: Several parallel algorithms and parallel architectures have been developed for partitional clustering. Hierarchical clustering algorithms are also widely used in exploratory pattern analysis and unsupervised learning. The author proposes parallel algorithms on single-instruction/multiple-data (SIMD) machines for hierarchical clustering, which requires intensive computation and large memory storage. The machine model includes a parallel memory system and an alignment network, to facilitate parallel access of both pattern matrix and proximity matrix. Since clustering algorithms tend to generate clusters even when applied to random data, clustering-tendency and cluster-validity studies are usually performed. The author proposes a parallel algorithm to compute one type of cluster validity measure, global fit of hierarchy, for quantitative data. For a problem with N patterns, considering validity study as well as clustering, the number of memory accesses is reduced from O(N^3) on a sequential machine to O(N^2) on a SIMD machine with N processing elements (PEs). More general algorithms for different numbers of PEs are also given.

Proceedings ArticleDOI
10 Apr 1989
TL;DR: An approach to segmenting a gray-level image is presented and used to reconstruct the three-dimensional shape of unconstrained objects and the edge detection problem is discussed as an inverse problem to clustering.
Abstract: An approach to segmenting a gray-level image is presented and used to reconstruct the three-dimensional shape of unconstrained objects. In analyzing a natural image, the authors cannot utilize heuristic rules that constrain the degree of freedom of reconstruction of the scene. Therefore they use local feature-based clustering, which utilizes the local distribution of the features. This clustering is based on an image itself and is considered as object-oriented processing. The edge detection problem is discussed as an inverse problem to clustering. Clustering methods which utilize both lowest features (for all pixels) and the features a little higher up are discussed with respect to their ability for exact segmentation.

Journal ArticleDOI
TL;DR: A fuzzy clustering algorithm is applied to grey-tone images with the number of classes fixed in advance, which first brings out non-disjoint clusters of pixels with a similar grey tone.

Journal Article
TL;DR: The method employs an iterative clustering routine which increases the number of clustered words and performance is determined from the average clustering ratio and the average cluster uniformity.
Abstract: The method employs an iterative clustering routine which increases the number of clustered words. Evaluations are thus obtained as a function of the number of iterations of the clustering routine with respect to (a) clustering characteristics, determined from the number of clustered words, the number of clusters formed, etc., and (b) performance, determined from the average clustering ratio and the average cluster uniformity. The applicability of the method to English and Japanese is established through evaluations indicating similarities between them in both clustering characteristics and performance.

01 Jan 1989
TL;DR: In this paper, a K-NN fuzzy classifier was used to detect new cases of ventricular arrhythmias with a high degree of reliability, using a training set of 90 ECG recordings.
Abstract: We have addressed the detection of life-threatening ventricular arrhythmias by applying statistical techniques to a training set of 90 ECG recordings. After the feature extraction phase, each of these recordings is characterized by a vector composed of 7 spectral characteristics. Because we work with small sets of samples, which are not representative of the probability distributions, and because we work with imprecisely defined categories, we have considered fuzzy classifiers based on K-NN rules in order to obtain better results. Label assignment on the training set is carried out using fuzzy clustering algorithms (fuzzy C-means and fuzzy covariance algorithms). The fuzzy covariance algorithm shows clusters associated with categories of ECG recordings. This information is the basis for a K-NN fuzzy classifier which detects new cases of ventricular arrhythmias with a high degree of reliability.

Book ChapterDOI
01 Jan 1989
TL;DR: By introducing an idea of prototype theory from the psychological domain with respect to human category formation, an alternative methodology of conceptual clustering is presented, and the algorithm and its clustering results are illustrated on a schematically modeled example.
Abstract: Human expert decision makers can be characterized by their ability to perceive a hypothetical conceptual pattern underlying a given collection of objects. The conventional cluster analysis is insufficient to generate such patterns since its clustering process is far from what the human decision makers actually do in inductively forming some concepts from individual observations based on the “meaning” of the objects and the clusters. In this paper, by introducing an idea of prototype theory from the psychological domain with respect to human category formation, an alternative methodology of conceptual clustering is presented. The algorithm can be roughly divided into two phases; an inductive prototype formation from training samples in a bottom-up way and a pattern-directed clustering of the instances being affected by the acquired concepts in a top-down fashion. Using the schematically-modeled example, the algorithm is illustrated as well as the clustering results.

DOI
21 Feb 1989
TL;DR: The case of hierarchical clustering is examined, the number of classes of topologically equivalent dendrograms is calculated and two algorithms for finding representatives from each class are presented.
Abstract: An integral part of clustering is the study of goodness of fit of the clustering imposed by a method on a set of objects. Many measures were introduced to achieve this ([1], [2]). In order to perform hypothesis testing on these measures, we need to know their distributions. To this end, we examine the case of hierarchical clustering, calculate the number of classes of topologically equivalent dendrograms and present two algorithms for finding representatives from each class.

Journal ArticleDOI
TL;DR: In this article, a goal-oriented conceptual clustering method for acquiring knowledge in a real-time production command system is described, which makes use of background knowledge, concept attributes of knowledge, and clustering goals in a clustering knowledge base to infer clustering attributes.

Book ChapterDOI
O. Opitz, R. Wiedemann
01 Jan 1989
TL;DR: For several economic problems and other areas, determining partitions is too specific and narrow a way to represent the classification structure of data; naturally given overlappings should not be suppressed, which raises the problem of how to limit overlappings.
Abstract: Cluster analysis is especially concerned with algorithms for computing partitions or hierarchies on given object sets. For several economic problems and other areas, however, the determination of partitions representing the classification structure of data is too specific and narrow. In contrast, naturally given overlappings should not be suppressed. Doing so, we obtain the problem of how to limit overlappings.