
Showing papers on "Cluster analysis published in 1994"


Journal ArticleDOI
TL;DR: An efficient method for estimating cluster centers of numerical data that can be used to determine the number of clusters and their initial values for initializing iterative optimization-based clustering algorithms such as fuzzy C-means is presented.
Abstract: We present an efficient method for estimating cluster centers of numerical data. This method can be used to determine the number of clusters and their initial values for initializing iterative optimization-based clustering algorithms such as fuzzy C-means. Here we use the cluster estimation method as the basis of a fast and robust algorithm for identifying fuzzy models. A benchmark problem involving the prediction of a chaotic time series shows this model identification method compares favorably with other, more computationally intensive methods. We also illustrate an application of this method in modeling the relationship between automobile trips and demographic factors.
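The estimation step can be sketched compactly. Below is a minimal, illustrative Python version of a subtractive-clustering-style potential computation; the radii, stopping ratio, and parameter names are assumptions for illustration, not the paper's exact accept/reject logic (data are assumed scaled to comparable ranges):

```python
import numpy as np

def subtractive_centers(X, ra=0.5, rb_factor=1.5, stop_ratio=0.15):
    """Pick cluster centers by repeatedly taking the highest-potential point
    and subtracting its influence (illustrative parameters)."""
    alpha = 4.0 / ra**2
    beta = 4.0 / (rb_factor * ra)**2
    d2 = ((X[:, None, :] - X[None, :, :])**2).sum(-1)   # pairwise squared distances
    potential = np.exp(-alpha * d2).sum(axis=1)         # density-like potential
    first = potential.max()
    centers = []
    while potential.max() >= stop_ratio * first:
        k = int(np.argmax(potential))
        centers.append(X[k])
        # reduce potential near the chosen center so the next peak lies elsewhere
        potential = potential - potential[k] * np.exp(-beta * d2[k])
    return np.array(centers)
```

The returned centers can seed an iterative optimizer such as fuzzy C-means, which is the use the paper describes.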

2,815 citations


Proceedings Article
12 Sep 1994
TL;DR: The analysis and experiments show that with the assistance of CLARANS, these two algorithms are very effective and can lead to discoveries that are difficult to find with current spatial data mining algorithms.
Abstract: Spatial data mining is the discovery of interesting relationships and characteristics that may exist implicitly in spatial databases. In this paper, we explore whether clustering methods have a role to play in spatial data mining. To this end, we develop a new clustering method called CLARANS which is based on randomized search. We also develop two spatial data mining algorithms that use CLARANS. Our analysis and experiments show that with the assistance of CLARANS, these two algorithms are very effective and can lead to discoveries that are difficult to find with current spatial data mining algorithms. Furthermore, experiments conducted to compare the performance of CLARANS with that of existing clustering methods show that CLARANS is the most efficient.
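The randomized-search scheme at the heart of CLARANS (examine random single-medoid swaps, restart after a run of failed attempts) can be sketched as a simplified k-medoids search; the parameter names numlocal and maxneighbor follow the paper's terminology, but this is not the authors' full implementation:

```python
import numpy as np

def clarans(X, k=3, numlocal=5, maxneighbor=50, seed=0):
    """k-medoids by randomized search in the spirit of CLARANS (sketch)."""
    rng = np.random.default_rng(seed)
    n = len(X)

    def cost(medoids):
        d = np.linalg.norm(X[:, None, :] - X[medoids][None, :, :], axis=2)
        return d.min(axis=1).sum()          # total distance to nearest medoid

    best, best_cost = None, np.inf
    for _ in range(numlocal):               # numlocal independent searches
        current = list(rng.choice(n, k, replace=False))
        current_cost = cost(current)
        fails = 0
        while fails < maxneighbor:
            # a random neighbour differs from the current node in one medoid
            i = int(rng.integers(k))
            cand = current.copy()
            cand[i] = int(rng.choice(np.setdiff1d(np.arange(n), current)))
            c = cost(cand)
            if c < current_cost:
                current, current_cost, fails = cand, c, 0
            else:
                fails += 1
        if current_cost < best_cost:        # keep the best local minimum found
            best, best_cost = current, current_cost
    return best, best_cost
```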

1,999 citations


Proceedings Article
01 Jan 1994
TL;DR: An incremental network model is introduced which is able to learn the important topological relations in a given set of input vectors by means of a simple Hebb-like learning rule.
Abstract: An incremental network model is introduced which is able to learn the important topological relations in a given set of input vectors by means of a simple Hebb-like learning rule. In contrast to previous approaches like the "neural gas" method of Martinetz and Schulten (1991, 1994), this model has no parameters which change over time and is able to continue learning, adding units and connections, until a performance criterion has been met. Applications of the model include vector quantization, clustering, and interpolation.
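The Hebb-like rule is easy to state in code: for each input, connect the two nearest units and adapt the winner and its topological neighbours. The sketch below is a fixed-size illustration that omits the model's defining unit-insertion (growth) mechanism; all parameter values are assumptions:

```python
import numpy as np

def hebbian_topology(X, n_units=20, steps=5000, eps_w=0.05, eps_n=0.006,
                     max_age=50, seed=0):
    """Competitive Hebbian learning sketch (no unit insertion)."""
    rng = np.random.default_rng(seed)
    W = X[rng.choice(len(X), n_units, replace=False)].astype(float).copy()
    edges = {}                                   # (i, j), i < j  ->  age
    for _ in range(steps):
        x = X[rng.integers(len(X))]
        order = np.argsort(np.linalg.norm(W - x, axis=1))
        s1, s2 = int(order[0]), int(order[1])    # two nearest units
        for e in list(edges):                    # age edges at the winner
            if s1 in e:
                edges[e] += 1
                if edges[e] > max_age:
                    del edges[e]                 # drop stale connections
        edges[(min(s1, s2), max(s1, s2))] = 0    # Hebb-like rule: refresh edge
        W[s1] += eps_w * (x - W[s1])             # move winner toward the input
        for i, j in edges:                       # and its topological neighbours
            if i == s1:
                W[j] += eps_n * (x - W[j])
            elif j == s1:
                W[i] += eps_n * (x - W[i])
    return W, sorted(edges)
```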

1,806 citations


Journal ArticleDOI
TL;DR: A new self-organizing neural network model with two variants is presented; it performs unsupervised learning and can be used for data visualization, clustering, and vector quantization, and results on the two-spirals benchmark and a vowel classification problem are better than any previously published.

1,319 citations


Book ChapterDOI
01 Jan 1994
TL;DR: The aim of this paper is to compare three methods based on the hypervolume criterion with four other well-known methods for determining the number of clusters on artificial data sets.
Abstract: A problem common to all clustering techniques is the difficulty of deciding the number of clusters present in the data. The aim of this paper is to compare three methods based on the hypervolume criterion with four other well-known methods. This evaluation of procedures for determining the number of clusters is conducted on artificial data sets. To provide a variety of solutions the data sets are analysed by six clustering methods. We finally conclude by pointing out the performance of each method and by giving some guidance for making choices between them.

1,264 citations


Posted Content
TL;DR: Deterministic annealing is used to find lowest distortion sets of clusters: as the annealing parameter increases, existing clusters become unstable and subdivide, yielding a hierarchical "soft" clustering of the data.
Abstract: We describe and experimentally evaluate a method for automatically clustering words according to their distribution in particular syntactic contexts. Deterministic annealing is used to find lowest distortion sets of clusters. As the annealing parameter increases, existing clusters become unstable and subdivide, yielding a hierarchical "soft" clustering of the data. Clusters are used as the basis for class models of word co-occurrence, and the models evaluated with respect to held-out test data.
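A minimal sketch of the deterministic-annealing mechanics: soft memberships sharpen as the inverse temperature β rises, and near-coincident centers split apart. For simplicity this uses squared Euclidean distortion on generic vectors rather than the paper's distributional (context) representation of words; the schedule and constants are illustrative:

```python
import numpy as np

def deterministic_annealing(X, n_clusters=8, beta0=0.1, beta_max=100.0, rate=1.2):
    """Soft clustering with a rising inverse temperature (illustrative)."""
    rng = np.random.default_rng(0)
    # start all centers near the data mean; they separate as beta grows
    C = X.mean(0) + 1e-3 * rng.standard_normal((n_clusters, X.shape[1]))
    beta = beta0
    while beta < beta_max:
        for _ in range(20):                     # inner fixed-point iterations
            d2 = ((X[:, None, :] - C[None, :, :])**2).sum(-1)
            logp = -beta * d2
            logp -= logp.max(axis=1, keepdims=True)
            P = np.exp(logp)
            P /= P.sum(axis=1, keepdims=True)   # soft memberships p(c|x)
            C = (P.T @ X) / P.sum(0)[:, None]   # distortion-minimizing centers
        beta *= rate                            # clusters subdivide as beta rises
    return C, P
```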

1,024 citations


Proceedings ArticleDOI
08 Mar 1994
TL;DR: This paper describes a method of creating a tied-state continuous speech recognition system using a phonetic decision tree, which is shown to lead to similar recognition performance to that obtained using an earlier data-driven approach but to have the additional advantage of providing a mapping for unseen triphones.
Abstract: The key problem to be faced when building an HMM-based continuous speech recogniser is maintaining the balance between model complexity and available training data. For large vocabulary systems requiring cross-word context dependent modelling, this is particularly acute since many such contexts will never occur in the training data. This paper describes a method of creating a tied-state continuous speech recognition system using a phonetic decision tree. This tree-based clustering is shown to lead to similar recognition performance to that obtained using an earlier data-driven approach but to have the additional advantage of providing a mapping for unseen triphones. State-tying is also compared with traditional model-based tying and shown to be clearly superior. Experimental results are presented for both the Resource Management and Wall Street Journal tasks.

781 citations


Journal ArticleDOI
TL;DR: A low-complexity heuristic for scheduling parallel tasks on an unbounded number of completely connected processors, named the dominant sequence clustering algorithm (DSC), which guarantees a performance within a factor of 2 of the optimum for general coarse-grain DAG's.
Abstract: We present a low-complexity heuristic, named the dominant sequence clustering algorithm (DSC), for scheduling parallel tasks on an unbounded number of completely connected processors. The performance of DSC is, on average, comparable to, or even better than, other higher-complexity algorithms. We assume no task duplication and nonzero communication overhead between processors. Finding the optimum solution for arbitrary directed acyclic task graphs (DAG's) is NP-complete. DSC finds optimal schedules for special classes of DAG's, such as fork, join, coarse-grain trees, and some fine-grain trees. It guarantees a performance within a factor of 2 of the optimum for general coarse-grain DAG's. We compare DSC with three higher-complexity general scheduling algorithms: the ETF by J.J. Hwang, Y.C. Chow, F.D. Anger, and C.Y. Lee (1989); V. Sarkar's (1989) clustering algorithm; and the MD by M.Y. Wu and D. Gajski (1990). We also give a sample of important practical applications where DSC has been found useful.

694 citations


Journal ArticleDOI
TL;DR: This work develops, based upon the mountain clustering method, a procedure for learning fuzzy systems models from data, and uses a back propagation algorithm to tune the model.
Abstract: We develop, based upon the mountain clustering method, a procedure for learning fuzzy systems models from data. First we discuss the mountain clustering method. We then show how it could be used to obtain the structure of fuzzy systems models. The initial estimates of this model are obtained from the cluster centers. We then use a back propagation algorithm to tune the model.

670 citations


Proceedings Article
01 Jul 1994

622 citations


Journal ArticleDOI
TL;DR: A simple and effective approach for approximate estimation of the cluster centers on the basis of the concept of a mountain function, based upon a gridding of the space, the construction of a mountain function from the data and then a destruction of the mountains to obtain the cluster centers.
Abstract: We develop a simple and effective approach for approximate estimation of the cluster centers on the basis of the concept of a mountain function. We call the procedure the mountain method. It can be useful for obtaining the initial values of the clusters that are required by more complex cluster algorithms. It also can be used as a stand alone simple approximate clustering technique. The method is based upon a gridding of the space, the construction of a mountain function from the data and then a destruction of the mountains to obtain the cluster centers.
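A compact sketch of the method as described: grid the space, build the mountain function, then destroy peaks. For brevity this extracts a fixed number of centers instead of the paper's stopping criterion, and the constants are illustrative; the grid grows exponentially with dimension, so this suits low-dimensional data:

```python
import numpy as np
from itertools import product

def mountain_centers(X, gridlines=10, alpha=5.0, beta=5.0, n_centers=3):
    """Grid-based mountain method sketch: build mountains, destroy peaks."""
    lo, hi = X.min(0), X.max(0)
    axes = [np.linspace(lo[d], hi[d], gridlines) for d in range(X.shape[1])]
    V = np.array(list(product(*axes)))              # candidate grid vertices
    d = np.linalg.norm(V[:, None, :] - X[None, :, :], axis=2)
    M = np.exp(-alpha * d).sum(axis=1)              # mountain height per vertex
    centers = []
    for _ in range(n_centers):
        k = int(np.argmax(M))
        centers.append(V[k])
        # destroy the peak so the next-highest mountain can emerge
        M -= M[k] * np.exp(-beta * np.linalg.norm(V - V[k], axis=1))
    return np.array(centers)
```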

Book ChapterDOI
10 Jul 1994
TL;DR: On four datasets, it is shown that only three or four prototypes sufficed to give predictive accuracy equal or superior to a basic nearest neighbor algorithm whose run-time storage costs were approximately 10 to 200 times greater.
Abstract: With the goal of reducing computational costs without sacrificing accuracy, we describe two algorithms to find sets of prototypes for nearest neighbor classification. Here, the term “prototypes” refers to the reference instances used in a nearest neighbor computation — the instances with respect to which similarity is assessed in order to assign a class to a new data item. Both algorithms rely on stochastic techniques to search the space of sets of prototypes and are simple to implement. The first is a Monte Carlo sampling algorithm; the second applies random mutation hill climbing. On four datasets we show that only three or four prototypes sufficed to give predictive accuracy equal or superior to a basic nearest neighbor algorithm whose run-time storage costs were approximately 10 to 200 times greater. We briefly investigate how random mutation hill climbing may be applied to select features and prototypes simultaneously. Finally, we explain the performance of the sampling algorithm on these datasets in terms of a statistical measure of the extent of clustering displayed by the target classes.
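The second algorithm is simple enough to sketch: keep a fixed-size set of prototype indices, mutate one slot at a time, and keep mutations that do not reduce 1-NN accuracy. The acceptance rule and evaluation set used here are illustrative assumptions, not the paper's exact protocol:

```python
import numpy as np

def rmhc_prototypes(X, y, n_protos=4, iters=500, seed=0):
    """Random mutation hill climbing over prototype index sets (sketch)."""
    rng = np.random.default_rng(seed)

    def accuracy(idx):
        d = np.linalg.norm(X[:, None, :] - X[idx][None, :, :], axis=2)
        pred = y[idx][d.argmin(axis=1)]      # 1-NN labels w.r.t. the prototypes
        return (pred == y).mean()

    idx = rng.choice(len(X), n_protos, replace=False)
    score = accuracy(idx)
    for _ in range(iters):
        cand = idx.copy()
        cand[rng.integers(n_protos)] = rng.integers(len(X))  # mutate one slot
        s = accuracy(cand)
        if s >= score:                       # keep any non-worsening mutation
            idx, score = cand, s
    return idx, score
```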

Book
09 Jan 1994
TL;DR: Inductive Learning Algorithms for Complex Systems Modeling is a professional monograph that surveys new types of learning algorithms for modelling complex systems in science and engineering.
Abstract: Introduction: Systems and Cybernetics. Inductive Learning Algorithms: Self-Organization Method. Network Structures. Long Term Quantitative Predictions. Dialogue Language Generalization. Noise Immunity and Convergence: Analogy with Information Theory. Classification and Analysis of Criteria. Improvement of Noise Immunity. Asymptotic Properties of Criteria. Balance Criterion of Predictions. Convergence of Algorithms. Physical Fields and Modeling: Finite-Difference Pattern Schemes. Comparative Studies. Cyclic Processes. Clusterization and Recognition: Self-Organization Modeling and Clustering. Methods of Self-Organization Clustering. Objective Computer Clustering Algorithm. Levels of Discretization and Balance Criterion. Forecasting Methods of Analogues. Applications: Fields of Application. Weather Modeling. Ecological System Studies. Modeling of Economical Systems. Agricultural System Studies. Modeling of Solar Activity. Inductive and Deductive Networks: Self-Organization Mechanism in the Networks. Network Techniques. Generalization. Comparison and Simulation Results. Basic Algorithms and Program Listings: Computational Aspects of Multilayered Algorithm. Computational Aspects of Combinatorial Algorithm. Computational Aspects of Harmonical Algorithm.

Journal ArticleDOI
TL;DR: A spectral approach to multi-way ratio-cut partitioning that provides a generalization of the ratio- cut cost metric to L-way partitioning and a lower bound on this cost metric is developed.
Abstract: Recent research on partitioning has focused on the ratio-cut cost metric, which maintains a balance between the cost of the edges cut and the sizes of the partitions without fixing the size of the partitions a priori. Iterative approaches and spectral approaches to two-way ratio-cut partitioning have yielded higher quality partitioning results. In this paper, we develop a spectral approach to multi-way ratio-cut partitioning that provides a generalization of the ratio-cut cost metric to L-way partitioning and a lower bound on this cost metric. Our approach involves finding the k smallest eigenvalue/eigenvector pairs of the Laplacian of the graph. The eigenvectors provide an embedding of the graph's n vertices into a k-dimensional subspace. We devise a time and space efficient clustering heuristic to coerce the points in the embedding into k partitions. Advancement over the current work is evidenced by the results of experiments on the standard benchmarks.
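The embedding step can be sketched directly; here a plain k-means on the rows of the eigenvector matrix stands in for the paper's time- and space-efficient clustering heuristic, and W is assumed to be a dense symmetric adjacency matrix:

```python
import numpy as np

def spectral_partition(W, k, seed=0):
    """Embed vertices with the k smallest Laplacian eigenvectors, then
    cluster the embedded points (k-means stands in for the paper's heuristic)."""
    D = np.diag(W.sum(axis=1))
    L = D - W                                   # graph Laplacian
    _, vecs = np.linalg.eigh(L)                 # eigenvalues in ascending order
    U = vecs[:, :k]                             # n x k embedding of the vertices
    rng = np.random.default_rng(seed)
    C = U[rng.choice(len(U), k, replace=False)]
    for _ in range(100):                        # plain k-means on the rows of U
        lab = ((U[:, None, :] - C[None, :, :])**2).sum(-1).argmin(1)
        C = np.array([U[lab == j].mean(0) if (lab == j).any() else C[j]
                      for j in range(k)])
    return lab
```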

Journal ArticleDOI
TL;DR: A fuzzy Kohonen clustering network is proposed which integrates the Fuzzy c-Means (FCM) model into the learning rate and updating strategies of the Kohonen network, and the numerical results show improved convergence as well as reduced labeling errors.

Journal ArticleDOI
TL;DR: An efficient method is proposed to obtain a good initial codebook that can accelerate the convergence of the generalized Lloyd algorithm and achieve a better local minimum as well.
Abstract: The generalized Lloyd algorithm plays an important role in the design of vector quantizers (VQ) and in feature clustering for pattern recognition. In the VQ context, this algorithm provides a procedure to iteratively improve a codebook and results in a local minimum that minimizes the average distortion function. We propose an efficient method to obtain a good initial codebook that can accelerate the convergence of the generalized Lloyd algorithm and achieve a better local minimum as well.
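For concreteness, here is a generalized Lloyd iteration with one common initialization heuristic (furthest-point seeding) standing in for the paper's specific initial-codebook method, which is not reproduced here:

```python
import numpy as np

def furthest_point_init(X, k, seed=0):
    """A common initialization heuristic (not the paper's method)."""
    rng = np.random.default_rng(seed)
    C = [X[rng.integers(len(X))]]
    for _ in range(k - 1):
        d = np.min([np.linalg.norm(X - c, axis=1) for c in C], axis=0)
        C.append(X[int(np.argmax(d))])     # next codeword: farthest from codebook
    return np.array(C)

def generalized_lloyd(X, k, iters=50):
    C = furthest_point_init(X, k)
    for _ in range(iters):
        lab = np.linalg.norm(X[:, None] - C[None], axis=2).argmin(1)
        for j in range(k):                 # move each codeword to its cell centroid
            if (lab == j).any():
                C[j] = X[lab == j].mean(0)
    return C, lab
```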

Proceedings ArticleDOI
10 Jun 1994
TL;DR: The optimum solution to the k-clustering problem is characterized by the ordinary Euclidean Voronoi diagram and the weighted Voronoi diagram with both multiplicative and additive weights.
Abstract: In this paper we consider the k-clustering problem for a set S of n points $p_i = (x_i)$ in the d-dimensional space with variance-based errors as clustering criteria, motivated from the color quantization problem of computing a color lookup table for frame buffer display. As the inter-cluster criterion to minimize, the sum of intra-cluster errors over every cluster is used, and as the intra-cluster criterion of a cluster $S_j$, $|S_j|^{\alpha-1} \sum_{p_i \in S_j} \lVert x_i - \bar{x}(S_j) \rVert^2$ is considered, where $\lVert\cdot\rVert$ is the $L_2$ norm and $\bar{x}(S_j)$ is the centroid of points in $S_j$, i.e., $(1/|S_j|)\sum_{p_i \in S_j} x_i$. The cases of $\alpha = 1, 2$ correspond to the sum of squared errors and the all-pairs sum of squared errors, respectively. The k-clustering problems under the criteria with $\alpha = 1, 2$ are treated in a unified manner by characterizing the optimum solution to the k-clustering problem by the ordinary Euclidean Voronoi diagram and the weighted Voronoi diagram with both multiplicative and additive weights. With this framework, the problem is related to the generalized primary shutter function for the Voronoi diagrams. The primary shutter function is shown to be $O(n^{O(kd)})$, which implies that, for fixed k, this clustering problem can be solved in polynomial time. For the problem with the most typical intra-cluster criterion of the sum of squared errors, we also present an efficient randomized algorithm which, roughly speaking, finds an $\varepsilon$-approximate 2-clustering in $O(n(1/\varepsilon)^d)$ time, which is quite practical and may be applied to real large-scale problems such as the color quantization problem.
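The stated correspondence for $\alpha = 2$ can be checked numerically: multiplying the sum of squared errors by $|S_j|$ gives the all-pairs sum of squared errors, with each unordered pair counted once. A small illustrative script:

```python
import numpy as np

rng = np.random.default_rng(1)
S = rng.standard_normal((50, 3))         # one cluster of 50 points in R^3
c = S.mean(0)                            # centroid
sse = ((S - c)**2).sum()                 # alpha = 1: sum of squared errors
crit2 = len(S)**(2 - 1) * sse            # alpha = 2 criterion: |S| * SSE
allpairs = sum(((S[i] - S[j])**2).sum()
               for i in range(len(S)) for j in range(i + 1, len(S)))
print(np.isclose(crit2, allpairs))       # True: |S| * SSE = all-pairs SSE
```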

Proceedings ArticleDOI
23 May 1994
TL;DR: A new model of learning probability distributions from independent draws is introduced, inspired by the popular Probably Approximately Correct (PAC) model for learning boolean functions from labeled examples, in the sense that it emphasizes efficient and approximate learning, and it studies the learnability of restricted classes of target distributions.
Abstract: We introduce and investigate a new model of learning probability distributions from independent draws. Our model is inspired by the popular Probably Approximately Correct (PAC) model for learning boolean functions from labeled examples [24], in the sense that we emphasize efficient and approximate learning, and we study the learnability of restricted classes of target distributions. The distribution classes we examine are often defined by some simple computational mechanism for transforming a truly random string of input bits (which is not visible to the learning algorithm) into the stochastic observation (output) seen by the learning algorithm. In this paper, we concentrate on discrete distributions over {0,1}^n. The problem of inferring an approximation to an unknown probability distribution on the basis of independent draws has a long and complex history in the pattern recognition and statistics literature. For instance, the problem of estimating the parameters of a Gaussian density in high-dimensional space is one of the most studied statistical problems. Distribution learning problems have often been investigated in the context of unsupervised learning, in which a linear mixture of two or more distributions is generating the observations, and the final goal is not to model the distributions themselves, but to predict from which distribution each observation was drawn. Data clustering methods are a common tool here. There is also a large literature on nonparametric density estimation, in which no assumptions are made on the unknown target density. Nearest-neighbor approaches to the unsupervised learning problem often arise in the nonparametric setting. While we obviously cannot do justice to these areas here, the books of Duda and Hart [9] and Vapnik [25] provide excellent overviews and introductions to the pattern recognition work, as well as many pointers for further reading. See also Izenman's recent survey article [16]. Roughly speaking, our work departs from the traditional statistical and pattern recognition approaches in two ways. First, we place explicit emphasis on the computational complexity of distribution learning. It seems fair to say that while previous research has provided an excellent understanding of the information-theoretic issues involved in distribution learning.

Journal ArticleDOI
TL;DR: This paper substantially improves RFCM by generalizing it to the case of arbitrary (symmetric) dissimilarity data; the resulting algorithm is applicable to any numerical relational data that are positive, reflexive (or anti-reflexive) and symmetric.

Journal ArticleDOI
TL;DR: A random-effects regression model is proposed for analysis of clustered data and a maximum marginal likelihood solution is described, and available statistical software for the model is discussed.
Abstract: A random-effects regression model is proposed for analysis of clustered data. Unlike ordinary regression analysis of clustered data, random-effects regression models do not assume that each observation is independent but do assume that data within clusters are dependent to some degree. The degree of this dependency is estimated along with estimates of the usual model parameters, thus adjusting these effects for the dependency resulting from the clustering of the data. A maximum marginal likelihood solution is described, and available statistical software for the model is discussed. An analysis of a dataset in which students are clustered within classrooms and schools is used to illustrate features of random-effects regression analysis, relative to both individual-level analysis that ignores the clustering of the data, and classroom-level analysis that aggregates the individual data.
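A random-intercept model of this kind can be fit with current software; a minimal sketch using statsmodels on synthetic students-within-classrooms data (all names and values hypothetical):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# hypothetical clustered data: students (rows) nested within classrooms (groups)
rng = np.random.default_rng(0)
n_class, n_per = 30, 20
cls = np.repeat(np.arange(n_class), n_per)
u = rng.normal(0, 1.0, n_class)                 # classroom random intercepts
x = rng.normal(size=n_class * n_per)
y = 2.0 + 0.5 * x + u[cls] + rng.normal(size=n_class * n_per)
df = pd.DataFrame({"y": y, "x": x, "classroom": cls})

# random-intercept regression: observations within a classroom share u_j,
# so within-cluster dependency is estimated rather than ignored
result = smf.mixedlm("y ~ x", df, groups=df["classroom"]).fit()
print(result.summary())                         # fixed effects + variance of u
```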

Proceedings Article
01 Jan 1994
TL;DR: This paper considers the k-clustering problem for a set S of n points $p_i = (x_i)$ in the d-dimensional space with variance-based errors as clustering criteria, motivated from the color quantization problem of computing a color lookup table for frame buffer display.
Abstract: In this paper we consider the k-clustering problem for a set S of n points $p_i = (x_i)$ in the d-dimensional space with variance-based errors as clustering criteria, motivated from the color quantization problem of computing a color lookup table for frame buffer display. As the inter-cluster criterion to minimize, the sum of intra-cluster errors over every cluster is used, and as the intra-cluster criterion of a cluster $S_j$, $|S_j|^{\alpha-1} \sum_{p_i \in S_j} \lVert x_i - \bar{x}(S_j) \rVert^2$ is considered.

ReportDOI
01 Dec 1994
TL;DR: A set of algorithms is described that handles clustering, classification, and function approximation from incomplete data in a principled and efficient manner, making two distinct appeals to the Expectation-Maximization principle.
Abstract: Real-world learning tasks often involve high-dimensional data sets with complex patterns of missing features. In this paper we review the problem of learning from incomplete data from two statistical perspectives---the likelihood-based and the Bayesian. The goal is two-fold: to place current neural network approaches to missing data within a statistical framework, and to describe a set of algorithms, derived from the likelihood-based framework, that handle clustering, classification, and function approximation from incomplete data in a principled and efficient manner.
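The likelihood-based treatment of missing features can be illustrated with the single-Gaussian special case: EM fills each incomplete row with its conditional mean and corrects the covariance with the conditional covariance. This is a sketch of the underlying principle, not the paper's mixture-based algorithms; NaN marks a missing value:

```python
import numpy as np

def em_gaussian_missing(X, iters=50):
    """EM for the mean/covariance of one Gaussian with missing entries."""
    X = X.copy()
    n, d = X.shape
    miss = np.isnan(X)
    mu = np.nanmean(X, axis=0)
    cov = np.diag(np.nanvar(X, axis=0))
    for _ in range(iters):
        EX = np.where(miss, mu, X)          # rows completed with current params
        C = np.zeros((d, d))                # accumulated conditional covariances
        for i in range(n):                  # E-step, row by row
            m, o = miss[i], ~miss[i]
            if not m.any():
                continue
            Soo_inv = np.linalg.inv(cov[np.ix_(o, o)])
            Smo = cov[np.ix_(m, o)]
            EX[i, m] = mu[m] + Smo @ Soo_inv @ (EX[i, o] - mu[o])
            C[np.ix_(m, m)] += cov[np.ix_(m, m)] - Smo @ Soo_inv @ Smo.T
        mu = EX.mean(axis=0)                # M-step
        diff = EX - mu
        cov = (diff.T @ diff + C) / n       # conditional-covariance correction
    return mu, cov
```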

Journal ArticleDOI
TL;DR: In this article, a simple constructive heuristic, a λ-interchange generation mechanism, and a hybrid simulated annealing (SA) and tabu search (TS) algorithm with computationally desirable features, based on a new non-monotonic cooling schedule, are developed.

Journal ArticleDOI
TL;DR: The system described here is an attempt to provide completely automatic segmentation and labeling of normal volunteer brains; the absolute accuracy of the segmentations has not yet been rigorously established.
Abstract: The authors' main contribution is to build upon their earlier efforts by expanding the tissue model concept to cover a brain volume. Furthermore, processing time is reduced and accuracy is enhanced by the use of knowledge propagation, where information derived from one slice is made available to succeeding slices as additional knowledge. The system is organized as follows. Each MR slice is initially segmented by an unsupervised fuzzy c-means clustering algorithm. Next, an expert system uses model-based recognition techniques to locate a landmark, called a focus-of-attention tissue. Qualitative models of slices of brain tissue are defined and matched with their instances from imaged slices. If a significant deformation is detected in a tissue, the slice is classified to be abnormal and volume processing halts. Otherwise, the expert system locates the next focus-of-attention tissue, based on a hierarchy of expected tissues. This process is repeated until either a slice is classified as abnormal or all tissues of the slice are labeled. If the slice is determined to be abnormal, the entire volume is also considered abnormal and processing halts. Otherwise, the system will proceed to the next slice and repeat the classification steps until all slices that comprise the volume are processed. A rule-based expert system tool, CLIPS, is used to organize the system. Low level modules for image processing and high level modules for image analysis, all written in the C language, are called as actions from the right hand sides of the rules. The system described here is an attempt to provide completely automatic segmentation and labeling of normal volunteer brains. The absolute accuracy of the segmentations has not yet been rigorously established. The relative accuracy appears acceptable. Efforts have been made to segment an entire volume (rather than merging a set of segmented slices) using supervised pattern recognition techniques or unsupervised fuzzy clustering. However, there is sometimes enough data nonuniformity between slices to prevent satisfactory segmentation.
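The initial slice segmentation relies on standard unsupervised fuzzy c-means; a minimal sketch of that algorithm alone (not the authors' expert system), operating on generic per-pixel feature vectors:

```python
import numpy as np

def fuzzy_c_means(X, c=4, m=2.0, iters=100, tol=1e-5, seed=0):
    """Fuzzy c-means sketch: alternate membership and center updates."""
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), c))
    U /= U.sum(axis=1, keepdims=True)        # memberships sum to 1 per point
    for _ in range(iters):
        Um = U**m
        V = (Um.T @ X) / Um.sum(axis=0)[:, None]               # cluster centers
        d = np.linalg.norm(X[:, None, :] - V[None, :, :], axis=2) + 1e-12
        ratio = (d[:, :, None] / d[:, None, :])**(2.0 / (m - 1.0))
        U_new = 1.0 / ratio.sum(axis=2)      # standard FCM membership update
        if np.abs(U_new - U).max() < tol:
            U = U_new
            break
        U = U_new
    return V, U
```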

Journal ArticleDOI
TL;DR: A method for locating clusters of geometrically similar conformers in ensembles of chemical conformations is described, which first calculates the pairwise interconformational distance matrix in either torsional or Cartesian space and uses an agglomerative, single‐link clustering method to define a hierarchy of clusterings in the same space.
Abstract: We describe a method for locating clusters of geometrically similar conformers in ensembles of chemical conformations. We first calculate the pairwise interconformational distance matrix in either torsional or Cartesian space and then use an agglomerative, single-link clustering method to define a hierarchy of clusterings in the same space. Especially good clusterings are distinguished by high values of the separation ratio: the ratio of the shortest intercluster distance to the characteristic threshold distance defining the clustering. We also discuss other statistics. The method has been embodied in a program called XCluster, which can display the distance matrix, the hierarchy of clusterings, and the clustering statistics in a variety of formats. XCluster can also write out the clustered conformations for subsequent or simultaneous viewing with a molecular visualization program. We demonstrate the sorts of insight that this approach affords with examples obtained from conformational search and molecular dynamics procedures. © 1994 by John Wiley & Sons, Inc.
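The clustering core of this pipeline maps naturally onto SciPy's hierarchical tools. The sketch below uses hypothetical conformer feature vectors, and the separation-ratio computation is one straightforward reading of the definition above (next merge height over current threshold), not XCluster's exact statistic:

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.cluster.hierarchy import linkage, fcluster

# hypothetical conformer coordinates flattened to feature vectors
rng = np.random.default_rng(0)
conformers = rng.standard_normal((40, 12))

D = pdist(conformers)                    # condensed pairwise distance matrix
Z = linkage(D, method="single")          # agglomerative single-link hierarchy

# separation ratio at each merge level: for single-link, the shortest
# intercluster distance is the next merge height; a high ratio marks an
# especially good clustering
heights = Z[:, 2]
ratios = heights[1:] / heights[:-1]
best = int(np.argmax(ratios))
labels = fcluster(Z, t=heights[best], criterion="distance")
print(len(set(labels)), "clusters at threshold", heights[best])
```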

Proceedings ArticleDOI
24 Jul 1994
TL;DR: The clustering algorithm presented here operates by estimating energy transfer between collections of objects while maintaining reliable error bounds on each transfer, and has obtained speedups of two orders of magnitude for environments of moderate complexity while maintaining comparable accuracy.
Abstract: We present an approach for accelerating hierarchical radiosity by clustering objects. Previous approaches constructed effective hierarchies by subdividing surfaces, but could not exploit a hierarchical grouping on existing surfaces. This limitation resulted in an excessive number of initial links in complex environments. Initial linking is potentially the most expensive portion of hierarchical radiosity algorithms, and constrains the complexity of the environments that can be simulated. The clustering algorithm presented here operates by estimating energy transfer between collections of objects while maintaining reliable error bounds on each transfer. Two methods of bounding the transfers are employed with different tradeoffs between accuracy and time. In contrast with the O(s^2) time and space complexity of the initial linking in previous hierarchical radiosity algorithms, the new methods have complexities of O(s log s) and O(s) for both time and space. Using these methods we have obtained speedups of two orders of magnitude for environments of moderate complexity while maintaining comparable accuracy.

Journal ArticleDOI
TL;DR: Relevance of the clustering to ecological, immune, neural, and cellular networks is discussed, with the emphasis on partially ordered states with chaotic itinerancy, and an extension allowing for the growth of the number of elements is given in connection with cell differentiation.

Proceedings ArticleDOI
06 Oct 1994
TL;DR: This paper takes advantage of the ability of many active optical range sensors to record intensity or even color in addition to the range information to improve the registration procedure by constraining potential matches between pairs of points based on a similarity measure derived from the intensity information.
Abstract: The determination of relative pose between two range images, also called registration, is a ubiquitous problem in computer vision, for geometric model building as well as dimensional inspection. The method presented in this paper takes advantage of the ability of many active optical range sensors to record intensity or even color in addition to the range information. This information is used to improve the registration procedure by constraining potential matches between pairs of points based on a similarity measure derived from the intensity information. One difficulty in using the intensity information is its dependence on the measuring conditions such as distance and orientation. The intensity or color information must first be converted into a viewpoint-independent feature. This can be achieved by inverting an illumination model, by differential feature measurements or by simple clustering. Following that step, a robust iterative closest point method is then used to perform the pose determination. Using the intensity can help to speed up convergence or, in cases of remaining degrees of freedom (e.g. on images of a sphere), to additionally constrain the match. The paper will describe the algorithmic framework and provide examples using range-and-color images.
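One way to realize the constrained matching is to reject nearest-neighbour pairs whose intensity features disagree. The sketch below assumes a basic point-to-point ICP with Kabsch alignment and a per-point intensity already converted to be viewpoint-independent; the paper's robust iterative closest point procedure is more elaborate:

```python
import numpy as np
from scipy.spatial import cKDTree

def icp_with_intensity(P, Q, ip, iq, max_di=0.1, iters=30):
    """Point-to-point ICP where matches must also agree in intensity.
    P, Q: (n,3)/(m,3) point sets; ip, iq: per-point intensity features."""
    R, t = np.eye(3), np.zeros(3)
    tree = cKDTree(Q)
    for _ in range(iters):
        Pt = P @ R.T + t
        _, idx = tree.query(Pt)                # closest points in Q
        keep = np.abs(ip - iq[idx]) < max_di   # drop intensity-inconsistent pairs
        A, B = Pt[keep], Q[idx[keep]]
        ca, cb = A.mean(0), B.mean(0)
        U, _, Vt = np.linalg.svd((A - ca).T @ (B - cb))
        Rd = Vt.T @ U.T                        # incremental rotation (Kabsch)
        if np.linalg.det(Rd) < 0:              # guard against reflections
            Vt[-1] *= -1
            Rd = Vt.T @ U.T
        td = cb - Rd @ ca
        R, t = Rd @ R, Rd @ t + td             # compose with current pose
    return R, t
```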

BookDOI
01 Jan 1994
TL;DR: Clusters and factors: neural algorithms for a novel representation of huge and highly multidimensional data sets and a generalisation of the diameter criterion for clustering.
Abstract: Classification and Clustering: Problems for the Future.- From classifications to cognitive categorization: the example of the road lexicon.- A review of graphical methods in Japan-from histogram to dynamic display.- New Data and New Tools: A Hypermedia Environment for Navigating Statistical Knowledge in Data Science.- On the logical necessity and priority of a monothetic conception of class, and on the consequent inadequacy of polythetic accounts of category and categorization.- Research and Applications of Quantification Methods in East Asian Countries.- Algorithms for a geometrical P.C.A. with the L1-norm.- Comparison of hierarchical classifications.- On quadripolar Robinson dissimilarity matrices.- An Ordered Set Approach to Neutral Consensus Functions.- From Apresjan Hierarchies and Bandelt-Dress Weak hierarchies to Quasi-hierarchies.- Spanning trees and average linkage clustering.- Adjustments of tree metrics based on minimum spanning trees.- The complexity of the median procedure for binary trees.- A multivariate analysis of a series of variety trials with special reference to classification of varieties.- Quality control of mixture. Application: The grass.- Mixture Analysis with Noisy Data.- Locally optimal tests on spatial clustering.- Choosing the Number of Clusters, Subset Selection of Variables, and Outlier Detection in the Standard Mixture-Model Cluster Analysis.- An examination of procedures for determining the number of clusters in a data set.- The gap test: an optimal method for determining the number of natural classes in cluster analysis.- Mode detection and valley seeking by binary morphological analysis of connectivity for pattern classification.- Interactive Class Classification Using Types.- K-means clustering in a low-dimensional Euclidean space.- Complexity relaxation of dynamic programming for cluster analysis.- Partitioning Problems in Cluster Analysis: A Review of Mathematical Programming Approaches.- Clusters and factors: neural algorithms for a novel representation of huge and highly multidimensional data sets.- Graphs and structural similarities.- A generalisation of the diameter criterion for clustering.- Percolation and multimodal data structuring.- Classification and Discrimination Techniques Applied to the Early Detection of Business Failure.- Recursive Partition and Symbolic Data Analysis.- Interpretation Tools For Generalized Discriminant Analysis.- Inference about rejected cases in discriminant analysis.- Structure Learning of Bayesian Networks by Genetic Algorithms.- On the representation of observational data used for classification and identification of natural objects.- Alternative strategies and CATANOVA testing in two-stage binary segmentation.- Alignment, Comparison and Consensus of Molecular Sequences.- An Empirical Evaluation of Consensus Rules for Molecular Sequences.- A Probabilistic Approach To Identifying Consensus In Molecular Sequences.- Applications of Distance Geometry to Molecular Conformation.- Classification of aligned biological sequences.- Use of Pyramids in Symbolic Data Analysis.- Proximity Coefficients between Boolean symbolic objects.- Conceptual Clustering in Structured Domains: A Theory Guided Approach.- Automatic Aid to Symbolic Cluster Interpretation.- Symbolic Clustering Algorithms using Similarity and Dissimilarity Measures.- Feature Selection for Symbolic Data Classification.- Towards extraction method of knowledge founded by symbolic objects.- One Method of Classification based on an Analysis of the Structural 
Relationship between Independent Variables.- The Integration of Neural Networks with Symbolic Knowledge Processing.- Ordering of Fuzzy k-Partitions.- On the Extension of Probability Theory and Statistics to the Handling of Fuzzy Data.- Fuzzy Regression.- Clustering and Aggregation of Fuzzy Preference Data: Agreement vs. Information.- Rough Classification with Valued Closeness Relation.- Representing proximities by network models.- An Eigenvector Algorithm to Fit lp-Distance Matrices.- A non linear approach to Non Symmetrical Data Analysis.- An Algorithmic Approach to Bilinear Models for Two-Way Contingency Tables.- New Approaches Based on Rankings in Sensory Evaluation.- Estimating failure times distributions from censored systems arranged in series.- Calibration Used as a Nonresponse Adjustment.- Least Squares Smoothers and Additive Decomposition.- High Dimensional Representations and Information Retrieval.- Experiments of Textual Data Analysis at Electricite de France.- Conception of a Data Supervisor in the Prospect of Piloting Management Quality of Service and Marketing.- Discriminant Analysis Using Textual Data.- Recent Developments in Case Based Reasoning: Improvements of Similarity Measures.- Contiguity in discriminant factorial analysis for image clustering.- Exploratory and Confirmatory Discrete Multivariate Analysis in a Probabilistic Approach for Studying the Regional Distribution of Aids in Angola.- Factor Analysis of Medical Image Sequences (FAMIS): Fundamental principles and applications.- Multifractal Segmentation of Medical Images.- The Human Organism-a Place to Thrive for the Immuno-Deficiency Virus.- Comparability and usefulness of newer and classical data analysis techniques. Application in medical domain classification.- The Classification of IRAS Point Sources.- Astronomical classification of the Hipparcos input catalogue.- Group identification and individual assignation of stars from kinematical and luminosity parameters.- Specific numerical and symbolic analysis of chronological series in view to classification of long period variable stars.- Author and Subject Index.

Patent
02 Sep 1994
TL;DR: In this paper, a system and method for processing stroke-based handwriting data for the purposes of automatically scoring and clustering the handwritten data to form letter prototypes is presented, where each character is represented by a plurality of mathematical feature vectors and each one of the plurality of feature vectors is labelled as corresponding to a particular character in the character strings.
Abstract: A system and method for processing stroke-based handwriting data for the purposes of automatically scoring and clustering the handwritten data to form letter prototypes. The present invention includes a method for processing digitized stroke-based handwriting data of known character strings, where each of the character strings is represented by a plurality of mathematical feature vectors. In this method, each one of the plurality of feature vectors is labelled as corresponding to a particular character in the character strings. A trajectory is then formed for each one of the plurality of feature vectors labelled as corresponding to a particular character. After the trajectories are formed, a distance value is calculated for each pair of trajectories corresponding to the particular character using a dynamic time warping method. The trajectories which are within a sufficiently small distance of each other are grouped to form a plurality of clusters. The clusters are used to define handwriting prototypes which identify subcategories of the character.
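The pairwise trajectory distance at the heart of the clustering step is dynamic time warping; a minimal, illustrative recurrence (not the patent's exact formulation):

```python
import numpy as np

def dtw(a, b):
    """Dynamic time warping distance between two feature-vector trajectories."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(a[i - 1] - b[j - 1])   # local frame distance
            D[i, j] = cost + min(D[i - 1, j],            # insertion
                                 D[i, j - 1],            # deletion
                                 D[i - 1, j - 1])        # match
    return D[n, m]

# trajectories within a small DTW distance of each other would then be grouped
# (e.g., by threshold-based agglomeration) to form the letter prototypes
```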