
Showing papers on "Fuzzy clustering published in 1985"


Journal ArticleDOI
TL;DR: A Monte Carlo evaluation of 30 procedures for determining the number of clusters was conducted on artificial data sets containing either 2, 3, 4, or 5 distinct nonoverlapping clusters, with each data set analyzed by four hierarchical clustering methods to provide a variety of clustering solutions.
Abstract: A Monte Carlo evaluation of 30 procedures for determining the number of clusters was conducted on artificial data sets which contained either 2, 3, 4, or 5 distinct nonoverlapping clusters. To provide a variety of clustering solutions, the data sets were analyzed by four hierarchical clustering methods. External criterion measures indicated excellent recovery of the true cluster structure by the methods at the correct hierarchy level. Thus, the clustering present in the data was quite strong. The simulation results for the stopping rules revealed a wide range in their ability to determine the correct number of clusters in the data. Several procedures worked fairly well, whereas others performed rather poorly. Thus, the latter group of rules would appear to have little validity, particularly for data sets containing distinct clusters. Applied researchers are urged to select one or more of the better criteria. However, users are cautioned that the performance of some of the criteria may be data dependent.
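
As background to how such stopping-rule studies are typically set up, the sketch below generates artificial data with a known number of well-separated clusters, cuts a hierarchical clustering at several candidate values of k, and scores each cut with one common internal criterion (Calinski-Harabasz). It is a minimal illustration assuming scikit-learn; the 30 procedures evaluated in the paper are not reproduced here.

```python
# Minimal sketch (assumes scikit-learn): score candidate numbers of clusters
# on synthetic data with a known structure, as stopping-rule studies do.
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.cluster import AgglomerativeClustering
from sklearn.metrics import calinski_harabasz_score

X, _ = make_blobs(n_samples=300, centers=4, cluster_std=0.6, random_state=0)

scores = {}
for k in range(2, 8):
    labels = AgglomerativeClustering(n_clusters=k, linkage="ward").fit_predict(X)
    scores[k] = calinski_harabasz_score(X, labels)  # higher is better

best_k = max(scores, key=scores.get)
print(scores, "-> chosen k =", best_k)  # should recover k = 4 for strong clusters
```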

3,551 citations


Journal ArticleDOI
TL;DR: A clustering algorithm based on a standard K-means approach which requires no user parameter specification is presented and experimental data show that this new algorithm performs as well or better than the previously used clustering techniques when tested as part of a speaker-independent isolated word recognition system.
Abstract: Studies of isolated word recognition systems have shown that a set of carefully chosen templates can be used to bring the performance of speaker-independent systems up to that of systems trained to the individual speaker. The earliest work in this area used a sophisticated set of pattern recognition algorithms in a human-interactive mode to create the set of templates (multiple patterns) for each word in the vocabulary. Not only was this procedure time consuming but it was impossible to reproduce exactly because it was highly dependent on decisions made by the experimenter. Subsequent work led to an automatic clustering procedure which, given only a set of clustering parameters, clustered patterns with the same performance as the previously developed supervised algorithms. The one drawback of the automatic procedure was that the specification of the input parameter set was found to be somewhat dependent on the vocabulary type and size of population to be clustered. Since a naive user of such a statistical clustering algorithm could not be expected, in general, to know how to choose the word clustering parameters, even this automatic clustering algorithm was not appropriate for a completely general word recognition system. It is the purpose of this paper to present a clustering algorithm based on a standard K-means approach which requires no user parameter specification. Experimental data show that this new algorithm performs as well or better than the previously used clustering techniques when tested as part of a speaker-independent isolated word recognition system.
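
A minimal sketch of the general idea of template generation by K-means, assuming scikit-learn and invented fixed-length feature vectors per utterance; the paper's parameter-free variant and its distance measure for speech patterns are not reproduced here.

```python
# Sketch only: derive word templates as K-means centroids of utterance features.
# Assumes each utterance is already reduced to a fixed-length feature vector.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
utterances = rng.normal(size=(200, 40))   # 200 tokens of one word, 40-dim features

n_templates = 6                           # illustrative number of templates per word
km = KMeans(n_clusters=n_templates, n_init=10, random_state=0).fit(utterances)
templates = km.cluster_centers_           # one representative pattern per cluster
print(templates.shape)                    # (6, 40)
```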

218 citations


Journal ArticleDOI
TL;DR: Methods of fuzzy clustering based on minimization of a scalar performance index with the aid of some labelled patterns are proposed, together with some modifications of the performance index that take into account the results of partial supervision.
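
For orientation, one way such a partially supervised objective is often written (an illustrative formulation, not necessarily the exact index used in the paper) augments the usual fuzzy c-means criterion with a term that penalizes deviation of the memberships of labelled patterns from their given labels:

```latex
% Illustrative objective: standard fuzzy c-means term plus a supervision penalty.
% b_k = 1 if pattern k is labelled (0 otherwise), f_{ik} is its given membership,
% and alpha >= 0 weights the supervision term.
J(U, V) = \sum_{i=1}^{c}\sum_{k=1}^{N} u_{ik}^{m}\,\lVert x_k - v_i \rVert^{2}
          \;+\; \alpha \sum_{i=1}^{c}\sum_{k=1}^{N} b_k\,(u_{ik} - f_{ik})^{2}\,\lVert x_k - v_i \rVert^{2}
```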

147 citations


Journal ArticleDOI
TL;DR: This work has led to the development of an iterative fuzzy clustering technique that provides an image segmentation scheme usable as a preprocessor for a multivalued-logic-based computer vision system.
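
As a generic illustration of iterative fuzzy clustering applied to image data (a plain fuzzy c-means sketch in NumPy, not the specific technique developed in the paper), pixel intensities can be grouped into c fuzzy classes whose membership maps then serve as a soft segmentation:

```python
# Generic fuzzy c-means sketch (NumPy only) applied to pixel intensities.
# Not the paper's algorithm; just the standard iterative scheme for illustration.
import numpy as np

def fuzzy_c_means(x, c=3, m=2.0, n_iter=100, tol=1e-5, seed=0):
    # x: (N, d) data matrix; returns (c, d) centers and (c, N) memberships
    rng = np.random.default_rng(seed)
    u = rng.random((c, x.shape[0]))
    u /= u.sum(axis=0, keepdims=True)                 # membership columns sum to 1
    for _ in range(n_iter):
        um = u ** m
        centers = um @ x / um.sum(axis=1, keepdims=True)
        dist = np.linalg.norm(x[None, :, :] - centers[:, None, :], axis=2) + 1e-12
        u_new = 1.0 / (dist ** (2.0 / (m - 1.0)))
        u_new /= u_new.sum(axis=0, keepdims=True)     # renormalize memberships
        if np.abs(u_new - u).max() < tol:
            u = u_new
            break
        u = u_new
    return centers, u

# Toy "image": 1-D intensity feature per pixel, two dominant intensity levels
rng = np.random.default_rng(1)
pixels = np.concatenate([rng.normal(0.2, 0.05, 500),
                         rng.normal(0.7, 0.05, 500)]).reshape(-1, 1)
centers, memberships = fuzzy_c_means(pixels, c=2)
segmentation = memberships.argmax(axis=0)             # hard labels from soft memberships
```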

144 citations


Journal ArticleDOI
TL;DR: A clustering algorithm making use of some properties of Sugeno's g_λ measure is presented and its performance, when run on the well-known set of the iris data, is briefly described.
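
For reference, the Sugeno g_λ (lambda-fuzzy) measure referred to here is usually defined by the following rule for disjoint sets, with λ fixed by the normalization condition on the whole space X; this is the standard textbook definition, given independently of the clustering algorithm itself.

```latex
% Sugeno lambda-measure: additivity rule for disjoint A, B, with g_lambda(X) = 1
g_\lambda(A \cup B) = g_\lambda(A) + g_\lambda(B) + \lambda\, g_\lambda(A)\, g_\lambda(B),
\qquad A \cap B = \emptyset,\ \lambda > -1,
\qquad 1 + \lambda = \prod_{i=1}^{n} \bigl(1 + \lambda\, g_\lambda(\{x_i\})\bigr)
```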

108 citations


Proceedings Article
18 Aug 1985
TL;DR: It is clarified that conceptual clustering processes can be explicated as being composed of three distinct but inter-dependent subprocesses, each of which may be characterized along a number of dimensions related to search, thus facilitating a better understanding of the conceptual clustering process as a whole.
Abstract: Methods for Conceptual Clustering may be explicated in two lights. Conceptual Clustering methods may be viewed as extensions to techniques of numerical taxonomy, a collection of methods developed by social and natural scientists for creating classification schemes over object sets. Alternatively, conceptual clustering may be viewed as a form of learning by observation or concept formation, as opposed to methods of learning from examples or concept identification. In this paper we survey and compare a number of conceptual clustering methods along dimensions suggested by each of these views. The point we most wish to clarify is that conceptual clustering processes can be explicated as being composed of three distinct but inter-dependent subprocesses: the process of deriving a hierarchical classification scheme; the process of aggregating objects into individual classes; and the process of assigning conceptual descriptions to object classes. Each subprocess may be characterized along a number of dimensions related to search, thus facilitating a better understanding of the conceptual clustering process as a whole.
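
The three-subprocess decomposition described above can be pictured as a small pipeline; the stubs below are purely illustrative scaffolding (the names and signatures are invented here, not taken from any surveyed system).

```python
# Illustrative scaffolding only: the three interdependent subprocesses of
# conceptual clustering as described in the survey (names are hypothetical).
from typing import Any, Dict, List

def derive_hierarchy(objects: List[Dict[str, Any]]) -> Any:
    """Search for a hierarchical classification scheme over the object set."""
    ...

def aggregate_classes(objects: List[Dict[str, Any]], hierarchy: Any) -> List[List[Dict[str, Any]]]:
    """Aggregate objects into individual classes consistent with the hierarchy."""
    ...

def characterize_classes(classes: List[List[Dict[str, Any]]]) -> List[str]:
    """Assign a conceptual description (e.g. necessary and sufficient conditions) to each class."""
    ...
```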

100 citations


Journal ArticleDOI
TL;DR: It is shown that by appropriate specification of the underlying model, the mixture maximum likelihood approach to clustering can be applied in the context of a three-way table and is illustrated using a soybean data set which consists of multiattribute measurements on a number of genotypes each grown in several environments.
Abstract: Clustering or classifying individuals into groups such that there is relative homogeneity within the groups and heterogeneity between the groups is a problem which has been considered for many years. Most available clustering techniques are applicable only to a two-way data set, where one of the modes is to be partitioned into groups on the basis of the other mode. Suppose, however, that the data set is three-way. Then what is needed is a multivariate technique which will cluster one of the modes on the basis of both of the other modes simultaneously. It is shown that by appropriate specification of the underlying model, the mixture maximum likelihood approach to clustering can be applied in the context of a three-way table. It is illustrated using a soybean data set which consists of multiattribute measurements on a number of genotypes each grown in several environments. Although the problem is set in the framework of clustering genotypes, the technique is applicable to other types of three-way data sets.
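
A much-simplified analogue of this idea (not the paper's model, which specifies the mixture over the three-way structure directly) is to unfold the attribute and environment modes into one long feature vector per genotype and fit a normal mixture to the resulting two-way matrix; the sketch below assumes scikit-learn and made-up dimensions.

```python
# Simplified analogue (not the paper's model): unfold a genotypes x attributes x
# environments array into a two-way matrix and fit a normal mixture over genotypes.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
n_genotypes, n_attributes, n_environments = 50, 4, 3      # made-up dimensions
data = rng.normal(size=(n_genotypes, n_attributes, n_environments))

X = data.reshape(n_genotypes, -1)                         # one row per genotype
gm = GaussianMixture(n_components=3, covariance_type="diag", random_state=0).fit(X)
groups = gm.predict(X)                                    # mixture-based clustering of genotypes
```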

86 citations


Proceedings ArticleDOI
05 Jun 1985
TL;DR: The cover coefficient based clustering methodology has been introduced and certain new concepts, relationships, and measures such as the effect of indexing on clustering, an optimal vocabulary generation for indexing, and a new matching function are discussed.
Abstract: Document clustering has several unresolved problems. Among them are high time and space complexity, difficulty of determining similarity thresholds, order dependence, nonuniform document distribution in clusters, and arbitrariness in the determination of various cluster initiators. To overcome these problems to some degree, the cover coefficient based clustering methodology has been introduced. This methodology has given rise to certain new concepts, relationships, and measures, such as the effect of indexing on clustering, an optimal vocabulary generation for indexing, and a new matching function. These new concepts are discussed. The results of performance experiments that show the effectiveness of the clustering methodology and the matching function are also included. In these experiments, it has also been observed that the majority of the documents obtained in a search are concentrated in a few clusters containing a low percentage of the documents of the database.
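
For readers unfamiliar with the cover coefficient, it is commonly described as a double normalization of the document-term matrix: c_ij measures the extent to which document i is "covered" by document j. The sketch below computes such a matrix in NumPy under that common description; it is an illustration of the coefficient only, not the paper's full clustering procedure.

```python
# Sketch of the cover coefficient matrix as it is commonly described:
# c[i, j] = sum_k (d[i, k] / row_sum_i) * (d[j, k] / col_sum_k).
# Illustration only; the paper's full clustering procedure is not reproduced.
import numpy as np

D = np.array([[1, 1, 0, 0],      # toy document-term matrix (4 docs x 4 terms)
              [1, 0, 1, 0],
              [0, 0, 1, 1],
              [0, 1, 0, 1]], dtype=float)

alpha = 1.0 / D.sum(axis=1, keepdims=True)   # reciprocal row sums
beta = 1.0 / D.sum(axis=0, keepdims=True)    # reciprocal column sums
C = (alpha * D) @ (beta * D).T               # cover coefficient matrix

print(np.allclose(C.sum(axis=1), 1.0))       # True: each row sums to one
print(C.diagonal())                          # "decoupling" of each document from the rest
```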

17 citations


Journal ArticleDOI
TL;DR: To enable PLL methods to be used when the number n of objects being clustered is large, this work describes an efficient PLL algorithm that operates in O(n^2 log n) time and O(n^2) space.
Abstract: Proportional link linkage (PLL) clustering methods are a parametric family of monotone invariant agglomerative hierarchical clustering methods. This family includes the single, minimedian, and complete linkage clustering methods as special cases; its members are used in psychological and ecological applications. Since the literature on clustering space distortion is oriented to quantitative input data, we adapt its basic concepts to input data with only ordinal significance and analyze the space distortion properties of PLL methods. To enable PLL methods to be used when the number n of objects being clustered is large, we describe an efficient PLL algorithm that operates in O(n^2 log n) time and O(n^2) space.
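
The PLL family itself is not implemented in common libraries, but two of its special cases mentioned above are; as context, the sketch below builds single and complete linkage hierarchies with SciPy (the intermediate proportional-link variants and the O(n^2 log n) algorithm of the paper are not shown).

```python
# Context only: single and complete linkage, two special cases of the PLL
# family, via SciPy; the paper's general PLL algorithm is not reproduced.
import numpy as np
from scipy.spatial.distance import pdist
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 2))
d = pdist(X)                                  # condensed pairwise distance matrix

Z_single = linkage(d, method="single")        # one end of the PLL family
Z_complete = linkage(d, method="complete")    # the other end

labels = fcluster(Z_complete, t=3, criterion="maxclust")  # cut into 3 clusters
```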

15 citations


22 Jul 1985
TL;DR: Artificial Intelligence (AI) methods for machine learning can be viewed as forms of exploratory data analysis, even though they differ markedly from the statistical methods generally connoted by the term.
Abstract: Artificial Intelligence (AI) methods for machine learning can be viewed as forms of exploratory data analysis, even though they differ markedly from the statistical methods generally connoted by the term. The distinction between methods of machine learning and statistical data analysis is primarily due to differences in the way techniques of each type represent data and structure within data. That is, methods of machine learning are strongly biased toward symbolic (as opposed to numeric) data representations. We explore this difference within a limited context, devoting the bulk of our paper to the explication of conceptual clustering, an extension of the statistically based methods of numerical taxonomy. In conceptual clustering the formation of object clusters is dependent on the quality of 'higher-level' characterizations, termed concepts, of the clusters. The form of concepts used by existing conceptual clustering systems (sets of necessary and sufficient conditions) is described in some detail. This is followed by descriptions of several conceptual clustering techniques, along with sample output. We conclude with a discussion of how alternative concept representations might enhance the effectiveness of future conceptual clustering systems. Keywords: Conceptual clustering; Concept formation; Hierarchical classification; Numerical taxonomy; Heuristic search; Exploratory data analysis.

14 citations


Journal ArticleDOI
TL;DR: A comparison of the results of fuzzy clustering with conventional clustering using a set of hypothetical data is made, and this technique might show great promise when applied to geochemical exploration problems.

Journal ArticleDOI
TL;DR: An automatic method performing selective averaging of visual evoked potentials depending on the state of the EEG background activity, based on adaptive segmentation of the EEG signal and on fuzzy clustering, is described.


Journal ArticleDOI
TL;DR: Two clustering methods, GLC and OUPIC, are introduced as tight-pattern clustering techniques, and the decisions for assigning classes to loose patterns are related to a heuristic membership function.
Abstract: A loose-pattern process approach to clustering sets consists of three main computations: loose-pattern reject option, tight-pattern classification, and loose-pattern class assignment. The loose-pattern rejection is implemented using a rule based on q nearest neighbors of each point. Two clustering methods, GLC and OUPIC, are introduced as tight-pattern clustering techniques. The decisions for assigning classes to loose patterns are related to a heuristic membership function. The function and experiments with one data set are discussed.
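
A rough analogue of the loose-pattern reject option (not the GLC or OUPIC methods themselves, and with an invented distance threshold) can be built from q-nearest-neighbour distances: points whose q-th neighbour is far away are set aside as "loose", the remaining tight patterns are clustered, and the loose ones are assigned afterwards to the nearest tight cluster.

```python
# Rough analogue of the three-stage scheme: q-NN based reject option,
# clustering of tight patterns, then assignment of loose patterns.
# GLC/OUPIC and the paper's heuristic membership function are not reproduced.
import numpy as np
from sklearn.neighbors import NearestNeighbors
from sklearn.cluster import AgglomerativeClustering

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.3, (80, 2)), rng.normal(3, 0.3, (80, 2)),
               rng.uniform(-2, 5, (20, 2))])          # two tight groups + scattered noise

q = 5
dist_q = NearestNeighbors(n_neighbors=q + 1).fit(X).kneighbors(X)[0][:, -1]
loose = dist_q > np.percentile(dist_q, 85)            # invented threshold: top 15% are "loose"

labels = np.full(len(X), -1)
labels[~loose] = AgglomerativeClustering(n_clusters=2).fit_predict(X[~loose])

centers = np.array([X[labels == k].mean(axis=0) for k in range(2)])
labels[loose] = np.argmin(np.linalg.norm(X[loose, None] - centers[None], axis=2), axis=1)
```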

Journal ArticleDOI
TL;DR: An efficient divisive clustering technique based on hierarchical partitioning of space is proposed; it may be executed in O(N) time at each level of the hierarchy.
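
One generic way to obtain roughly linear work per level in a divisive, space-partitioning scheme (a k-d-tree-style bisection sketch, not the paper's specific technique) is to split each node at the median of its widest coordinate, so that every point in a level is touched a bounded number of times; median selection can be done in linear time, though np.median is used below for brevity.

```python
# Generic divisive space-partitioning sketch (k-d-tree-style median splits);
# not the paper's algorithm, only an illustration of per-level divisive work.
import numpy as np

def divisive_split(points, depth, max_depth=3):
    """Recursively bisect a point set at the median of its widest dimension."""
    if depth == max_depth or len(points) <= 1:
        return [points]
    dim = np.argmax(points.max(axis=0) - points.min(axis=0))   # widest spread
    cut = np.median(points[:, dim])
    left, right = points[points[:, dim] <= cut], points[points[:, dim] > cut]
    if len(left) == 0 or len(right) == 0:                      # degenerate split: stop
        return [points]
    return (divisive_split(left, depth + 1, max_depth) +
            divisive_split(right, depth + 1, max_depth))

rng = np.random.default_rng(0)
clusters = divisive_split(rng.normal(size=(200, 2)), depth=0)
print([len(c) for c in clusters])                              # leaf cluster sizes
```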