
Showing papers on "Cluster analysis" published in 1993


Journal ArticleDOI
TL;DR: An appropriate objective function whose minimum will characterize a good possibilistic partition of the data is constructed, and the membership and prototype update equations are derived from necessary conditions for minimization of the criterion function.
Abstract: The clustering problem is cast in the framework of possibility theory. The approach differs from the existing clustering methods in that the resulting partition of the data can be interpreted as a possibilistic partition, and the membership values can be interpreted as degrees of possibility of the points belonging to the classes, i.e., the compatibilities of the points with the class prototypes. An appropriate objective function whose minimum will characterize a good possibilistic partition of the data is constructed, and the membership and prototype update equations are derived from necessary conditions for minimization of the criterion function. The advantages of the resulting family of possibilistic algorithms are illustrated by several examples.

2,388 citations
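
As a concrete illustration of those update equations, here is a minimal NumPy sketch of a possibilistic c-means iteration. The fuzzifier m, the per-cluster scale parameters eta (set crudely from the overall data variance), and the random initialization are illustrative assumptions, not the paper's experimental settings.

```python
import numpy as np

def possibilistic_c_means(X, c, m=2.0, n_iter=100, seed=0):
    """Minimal possibilistic c-means sketch.

    U[i, j] is the typicality of point i for cluster j; unlike fuzzy
    c-means, the rows of U need not sum to 1 across clusters.
    """
    rng = np.random.default_rng(seed)
    V = X[rng.choice(len(X), c, replace=False)].astype(float)  # prototypes
    eta = np.full(c, X.var())          # scale parameters (crude assumption)
    for _ in range(n_iter):
        d2 = ((X[:, None, :] - V[None, :, :]) ** 2).sum(-1)    # squared distances
        U = 1.0 / (1.0 + (d2 / eta) ** (1.0 / (m - 1.0)))      # typicality update
        Um = U ** m
        V = (Um[:, :, None] * X[:, None, :]).sum(0) / Um.sum(0)[:, None]
    return U, V
```

Because each membership depends only on the distance to its own prototype, outliers end up with low typicality for every cluster rather than being forced to spread a unit of membership across clusters, which is the practical appeal of possibilistic over probabilistic partitions.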


Journal ArticleDOI
TL;DR: The classification maximum likelihood approach is sufficiently general to encompass many current clustering algorithms, including those based on the sum of squares criterion and on the criterion of Friedman and Rubin (1967), but it is restricted to Gaussian distributions and it does not allow for noise.
Abstract: The classification maximum likelihood approach is sufficiently general to encompass many current clustering algorithms, including those based on the sum of squares criterion and on the criterion of Friedman and Rubin (1967). However, as currently implemented, it does not allow the specification of which features (orientation, size and shape) are to be common to all clusters and which may differ between clusters. Also, it is restricted to Gaussian distributions and it does not allow for noise. We propose ways of overcoming these limitations. A reparameterization of the covariance matrix allows us to specify that some features, but not all, be the same for all clusters. A practical framework for non-Gaussian clustering is outlined, and a means of incorporating noise in the form of a Poisson process is described. An approximate Bayesian method for choosing the number of clusters is given. The performance of the proposed methods is studied by simulation, with encouraging results. The methods are applied to the analysis of a data set arising in the study of diabetes, and the results seem better than those of previous analyses.

2,336 citations
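
The classification maximum likelihood idea can be sketched as a "classification EM" loop: hard-assign each point to the Gaussian class that best explains it, then refit the parameters. The sketch below pools a single covariance across clusters, so orientation, size, and shape are common, which connects to the sum-of-squares and Friedman-Rubin criteria; it does not implement the paper's covariance reparameterization, Poisson noise term, or Bayesian choice of the number of clusters.

```python
import numpy as np
from scipy.stats import multivariate_normal

def cem_gaussian(X, k, n_iter=50, seed=0):
    """Classification EM sketch with one pooled covariance matrix.

    Replacing the pooled covariance with per-cluster covariances would let
    orientation, size, and shape differ between clusters.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    mu = X[rng.choice(n, k, replace=False)].astype(float)
    cov = np.cov(X.T) + 1e-6 * np.eye(d)
    for _ in range(n_iter):
        # C step: assign each point to the class with the highest log density
        ll = np.stack([multivariate_normal.logpdf(X, mu[j], cov) for j in range(k)], 1)
        z = ll.argmax(1)
        # M step: refit means and the pooled within-cluster covariance
        for j in range(k):
            if np.any(z == j):
                mu[j] = X[z == j].mean(0)
        resid = X - mu[z]
        cov = resid.T @ resid / n + 1e-6 * np.eye(d)
    return z, mu
```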


Journal ArticleDOI
TL;DR: It is shown that the dynamics of the reference (weight) vectors during the input-driven adaptation procedure are determined by the gradient of an energy function whose shape can be modulated through a neighborhood determining parameter and resemble the dynamicsof Brownian particles moving in a potential determined by a data point density.
Abstract: A neural network algorithm based on a soft-max adaptation rule is presented. This algorithm exhibits good performance in reaching the optimum minimization of a cost function for vector quantization data compression. The soft-max rule employed is an extension of the standard K-means clustering procedure and takes into account a neighborhood ranking of the reference (weight) vectors. It is shown that the dynamics of the reference (weight) vectors during the input-driven adaptation procedure are determined by the gradient of an energy function whose shape can be modulated through a neighborhood determining parameter and resemble the dynamics of Brownian particles moving in a potential determined by the data point density. The network is used to represent the attractor of the Mackey-Glass equation and to predict the Mackey-Glass time series, with additional local linear mappings for generating output values. The results obtained for the time-series prediction compare favorably with the results achieved by backpropagation and radial basis function networks.

1,504 citations
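
The soft-max adaptation rule itself fits in a few lines: every reference vector moves toward the input by an amount that decays exponentially with its distance rank. This sketch assumes a fixed step size eps and neighborhood width lam, whereas the published algorithm anneals both over the course of training.

```python
import numpy as np

def soft_max_step(W, x, eps=0.05, lam=1.0):
    """One rank-based ('soft-max') adaptation step for reference vectors W.

    The k-th closest vector is moved toward x with strength exp(-k / lam);
    in the limit lam -> 0 this recovers the hard K-means winner-take-all
    update.
    """
    ranks = np.argsort(np.argsort(((W - x) ** 2).sum(axis=1)))  # 0 = winner
    W += eps * np.exp(-ranks / lam)[:, None] * (x - W)
    return W
```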


Journal ArticleDOI
TL;DR: This paper discusses five characteristics of the objective function that make global optimization difficult and presents a strategy for function optimization called the shuffled complex evolution (SCE) method, which promises to be robust, effective, and efficient for a broad class of problems.
Abstract: The degree of difficulty in solving a global optimization problem is in general dependent on the dimensionality of the problem and certain characteristics of the objective function. This paper discusses five of these characteristics and presents a strategy for function optimization called the shuffled complex evolution (SCE) method, which promises to be robust, effective, and efficient for a broad class of problems. The SCE method is based on a synthesis of four concepts that have proved successful for global optimization: (a) combination of probabilistic and deterministic approaches; (b) clustering; (c) systematic evolution of a complex of points spanning the space, in the direction of global improvement; and (d) competitive evolution. Two algorithms based on the SCE method are presented. These algorithms are tested by running 100 randomly initiated trials on eight test problems of differing difficulty. The performance of the two algorithms is compared to that of the controlled random search CRS2 method presented by Price (1983, 1987) and to a multistart algorithm based on the simplex method presented by Nelder and Mead (1965).

1,481 citations
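
A heavily simplified skeleton of the shuffle/partition structure is sketched below. The real competitive complex evolution step evolves sub-simplexes with probabilistic parent selection and reflection/contraction moves; a few bounded Nelder-Mead iterations per complex stand in for it here, and all settings are illustrative.

```python
import numpy as np
from scipy.optimize import minimize

def sce_sketch(f, bounds, n_complexes=4, m=10, n_shuffles=20, seed=0):
    """Simplified shuffled-complex-evolution skeleton (not the full SCE).

    The population is ranked, dealt into complexes in round-robin order,
    each complex is evolved independently, and the complexes are then
    shuffled back together.
    """
    rng = np.random.default_rng(seed)
    lo, hi = np.asarray(bounds, float).T
    pop = rng.uniform(lo, hi, size=(n_complexes * m, len(lo)))
    for _ in range(n_shuffles):
        pop = pop[np.argsort([f(p) for p in pop])]     # rank by objective
        for c in range(n_complexes):
            idx = np.arange(c, len(pop), n_complexes)  # systematic partition
            res = minimize(f, pop[idx[0]], method="Nelder-Mead",
                           options={"maxiter": 20})    # evolve this complex
            pop[idx[0]] = np.clip(res.x, lo, hi)
    return pop[np.argmin([f(p) for p in pop])]
```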


Journal ArticleDOI
TL;DR: A novel graph theoretic approach for data clustering is presented and its application to the image segmentation problem is demonstrated; the fast algorithm yields an optimal solution equivalent to that obtained by partitioning the complete equivalent tree and can handle very large graphs with several hundred thousand vertices.
Abstract: A novel graph theoretic approach for data clustering is presented and its application to the image segmentation problem is demonstrated. The data to be clustered are represented by an undirected adjacency graph G with arc capacities assigned to reflect the similarity between the linked vertices. Clustering is achieved by removing arcs of G to form mutually exclusive subgraphs such that the largest inter-subgraph maximum flow is minimized. For graphs of moderate size (approximately 2000 vertices), the optimal solution is obtained through partitioning a flow and cut equivalent tree of G, which can be efficiently constructed using the Gomory-Hu algorithm (1961). However, for larger graphs this approach is impractical. New theorems for subgraph condensation are derived and are then used to develop a fast algorithm which hierarchically constructs and partitions a partially equivalent tree of much reduced size. This algorithm results in an optimal solution equivalent to that obtained by partitioning the complete equivalent tree and is able to handle very large graphs with several hundred thousand vertices. The new clustering algorithm is applied to the image segmentation problem. The segmentation is achieved by effectively searching for closed contours of edge elements (equivalent to minimum cuts in G), which consist mostly of strong edges, while rejecting contours containing isolated strong edges. This method is able to accurately locate region boundaries and at the same time guarantees the formation of closed edge contours.

1,223 citations
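
For graphs of moderate size, the flow-equivalent-tree step is available off the shelf. The sketch below assumes networkx's gomory_hu_tree with similarities stored in a "capacity" edge attribute; it performs the basic two-way split at the global minimum cut, and removing further light tree edges extends the split to more clusters, in the spirit of the paper's tree partitioning. The condensation-based fast algorithm for very large graphs is not reproduced here.

```python
import networkx as nx

def two_way_mincut_clusters(G):
    """Split a similarity graph into two clusters via its Gomory-Hu tree.

    Edges of G carry a 'capacity' attribute reflecting similarity. The
    lightest edge of the flow-equivalent tree corresponds to the global
    minimum cut, i.e. the split whose largest inter-subgraph maximum flow
    is smallest.
    """
    T = nx.gomory_hu_tree(G, capacity="capacity")
    u, v = min(T.edges, key=lambda e: T.edges[e]["weight"])
    T.remove_edge(u, v)
    a, b = nx.connected_components(T)   # exactly two components remain
    return set(a), set(b)
```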


Journal ArticleDOI
TL;DR: In this article, a version of the QCD-motivated "k⊥" jet-clustering algorithm for hadron-hadron collisions is proposed, which is invariant under boosts along the beam directions.

1,130 citations
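
Since this entry carries no abstract, a hedged sketch of the boost-invariant distance measures used by this family of algorithms may help; the exact exclusive/inclusive variants and recombination schemes differ between papers, so this shows only the generic pairwise and beam distances.

```python
import numpy as np

def kt_distances(kt, eta, phi, R=1.0):
    """Generic k_T-style clustering distances (a sketch, not the paper's
    exact scheme).

    d_ij = min(kt_i, kt_j)^2 * ((eta_i - eta_j)^2 + dphi_ij^2) / R^2
    d_iB = kt_i^2   (beam distance)

    Built from transverse momenta and (rapidity, azimuth) differences, these
    quantities are invariant under boosts along the beam axis. A clustering
    loop repeatedly merges the pair with the smallest d_ij, or promotes
    particle i to a jet when some d_iB is the smallest quantity.
    """
    dphi = np.abs(phi[:, None] - phi[None, :])
    dphi = np.minimum(dphi, 2 * np.pi - dphi)   # wrap azimuthal differences
    dr2 = (eta[:, None] - eta[None, :]) ** 2 + dphi ** 2
    kt2 = kt ** 2
    dij = np.minimum(kt2[:, None], kt2[None, :]) * dr2 / R ** 2
    return dij, kt2
```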


Proceedings ArticleDOI
22 Jun 1993
TL;DR: In this article, a method for clustering words according to their distribution in particular syntactic contexts is described and evaluated experimentally, where words are represented by the relative frequency distributions of contexts in which they appear, and relative entropy between those distributions is used as the similarity measure for word clustering.
Abstract: We describe and evaluate experimentally a method for clustering words according to their distribution in particular syntactic contexts. Words are represented by the relative frequency distributions of contexts in which they appear, and relative entropy between those distributions is used as the similarity measure for clustering. Clusters are represented by average context distributions derived from the given words according to their probabilities of cluster membership. In many cases, the clusters can be thought of as encoding coarse sense distinctions. Deterministic annealing is used to find lowest distortion sets of clusters: as the annealing parameter increases, existing clusters become unstable and subdivide, yielding a hierarchical "soft" clustering of the data. Clusters are used as the basis for class models of word co-occurrence, and the models are evaluated with respect to held-out test data.

1,042 citations
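
The core of the method at a single annealing temperature can be sketched compactly: memberships are a softmax over KL divergences between each word's context distribution and the cluster centroids, and centroids are membership-weighted average distributions. Everything here (beta, initialization, iteration count) is illustrative; the full method anneals beta upward so that clusters split hierarchically.

```python
import numpy as np

def soft_kl_cluster(P, k, beta=5.0, n_iter=50, seed=0):
    """Soft distributional clustering sketch at one fixed temperature.

    P[w] is word w's context distribution (rows sum to 1). Membership of w
    in cluster c is proportional to exp(-beta * KL(P[w] || Q[c])), and each
    centroid Q[c] is the membership-weighted average context distribution.
    """
    rng = np.random.default_rng(seed)
    eps = 1e-12
    Q = P[rng.choice(len(P), k, replace=False)] + eps
    Q /= Q.sum(1, keepdims=True)
    for _ in range(n_iter):
        kl = (P[:, None, :] *
              (np.log(P[:, None, :] + eps) - np.log(Q[None]))).sum(-1)
        logM = -beta * kl
        logM -= logM.max(1, keepdims=True)        # stabilized softmax
        M = np.exp(logM)
        M /= M.sum(1, keepdims=True)              # p(cluster | word)
        Q = M.T @ P + eps
        Q /= Q.sum(1, keepdims=True)
    return M, Q
```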


Journal ArticleDOI
TL;DR: This paper has reviewed, with somewhat variable coverage, the nine MR image segmentation techniques itemized in Table II; each has its merits and drawbacks.
Abstract: This paper has reviewed, with somewhat variable coverage, the nine MR image segmentation techniques itemized in Table II. A wide array of approaches has been discussed; each has its merits and drawbacks. We have also given pointers to other approaches not discussed in depth in this review. The methods reviewed fall roughly into four model groups: c-means, maximum likelihood, neural networks, and k-nearest neighbor rules. Both supervised and unsupervised schemes require human intervention to obtain clinically useful results in MR segmentation. Unsupervised techniques require somewhat less interaction on a per patient/image basis. Maximum likelihood techniques have had some success, but are very susceptible to the choice of training region, which may need to be chosen slice by slice for even one patient. Generally, techniques that must assume an underlying statistical distribution of the data (such as LML and UML) do not appear promising, since tissue regions of interest do not usually obey the distributional tendencies of probability density functions. The most promising supervised techniques reviewed seem to be FF/NN methods that allow hidden layers to be configured as examples are presented to the system. An example of a self-configuring network, FF/CC, was also discussed. The relatively simple k-nearest neighbor rule algorithms (hard and fuzzy) have also shown promise in the supervised category. Unsupervised techniques based upon fuzzy c-means clustering algorithms have also shown great promise in MR image segmentation. Several unsupervised connectionist techniques have recently been experimented with on MR images of the brain and have provided promising initial results. A pixel-intensity-based edge detection algorithm has recently been used to provide promising segmentations of the brain. This is also an unsupervised technique, older versions of which have been susceptible to oversegmenting the image because of the lack of clear boundaries between tissue types or to finding uninteresting boundaries between slightly different types of the same tissue. To conclude, we offer some remarks about improving MR segmentation techniques. The better unsupervised techniques are too slow. Improving speed via parallelization and optimization will improve their competitiveness with, e.g., the k-nn rule, which is the fastest technique covered in this review. Another area for development is dynamic cluster validity. Unsupervised methods need better ways to specify and adjust c, the number of tissue classes found by the algorithm. Initialization is a third important area of research. Many of the schemes listed in Table II are sensitive to good initialization, both in terms of the parameters of the design and in terms of operator selection of training data. (ABSTRACT TRUNCATED AT 400 WORDS)

1,036 citations
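
Of the unsupervised methods the review highlights, fuzzy c-means is the easiest to state compactly. The sketch below clusters raw pixel intensities only, which is a deliberate simplification: real MR segmentation typically clusters multispectral voxel features and still faces the cluster-count (c) and initialization issues discussed above.

```python
import numpy as np

def fcm_intensities(img, c=3, m=2.0, n_iter=50, seed=0):
    """Unsupervised fuzzy c-means on pixel intensities (a minimal sketch)."""
    x = img.reshape(-1, 1).astype(float)
    rng = np.random.default_rng(seed)
    v = x[rng.choice(len(x), c, replace=False)]    # class centroids
    for _ in range(n_iter):
        d2 = (x - v.T) ** 2 + 1e-12                # (n_pixels, c) distances
        u = d2 ** (-1.0 / (m - 1.0))
        u /= u.sum(1, keepdims=True)               # memberships sum to 1
        um = u ** m
        v = (um.T @ x) / um.sum(0)[:, None]        # weighted centroid update
    return u.argmax(1).reshape(img.shape), v
```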


Journal ArticleDOI
TL;DR: In this paper, the authors review the detailed understanding of asymptotic kinetics, spatial correlations, percolative structure, etc., which is emerging for these far-from-equilibrium processes.
Abstract: Irreversible random sequential adsorption (RSA) on lattices, and continuum "car parking" analogues, have long received attention as models for reactions on polymer chains, chemisorption on single-crystal surfaces, adsorption in colloidal systems, and solid state transformations. Cooperative generalizations of these models (CSA) are sometimes more appropriate, and can exhibit richer kinetics and spatial structure, e.g., autocatalysis and clustering. The distribution of filled or transformed sites in RSA and CSA is not described by an equilibrium Gibbs measure. This is the case even for the saturation "jammed" state of models where the lattice or space cannot fill completely. However exact analysis is often possible in one dimension, and a variety of powerful analytic methods have been developed for higher dimensional models. Here we review the detailed understanding of asymptotic kinetics, spatial correlations, percolative structure, etc., which is emerging for these far-from-equilibrium processes.

898 citations
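
The flavor of these models is easy to reproduce numerically. The sketch below simulates irreversible dimer adsorption on a one-dimensional lattice; as the lattice grows, the jamming coverage approaches Flory's 1 - e^(-2) ≈ 0.8647 rather than 1, a jammed state that no equilibrium Gibbs measure describes.

```python
import numpy as np

def rsa_dimers(n_sites=100_000, seed=0):
    """Irreversible random sequential adsorption of dimers on a 1-D lattice.

    Each bond (i, i+1) is attempted once in random order; because occupancy
    only ever grows, a bond rejected at its turn would be rejected at any
    later time, so one random pass reaches the same jammed state as repeated
    random attempts.
    """
    rng = np.random.default_rng(seed)
    occ = np.zeros(n_sites, dtype=bool)
    for i in rng.permutation(n_sites - 1):
        if not occ[i] and not occ[i + 1]:
            occ[i] = occ[i + 1] = True   # dimer sticks irreversibly
    return occ.mean()                    # coverage at jamming
```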


Journal ArticleDOI
TL;DR: It is shown that the radial basis function network has an identical structure to the optimal Bayesian symbol-decision equalizer solution and, therefore, can be employed to implement the Bayesian equalizer.
Abstract: The application of a radial basis function network to digital communications channel equalization is examined. It is shown that the radial basis function network has an identical structure to the optimal Bayesian symbol-decision equalizer solution and, therefore, can be employed to implement the Bayesian equalizer. The training of a radial basis function network to realize the Bayesian equalization solution can be achieved efficiently using a simple and robust supervised clustering algorithm. During data transmission a decision-directed version of the clustering algorithm enables the radial basis function network to track a slowly time-varying environment. Moreover, the clustering scheme provides an automatic compensation for nonlinear channel and equipment distortion. Computer simulations are included to illustrate the analytical results.

794 citations


Journal ArticleDOI
TL;DR: Experimental results show that RPCL outperforms FSCL when used for unsupervised classification, for training a radial basis function (RBF) network, and for curve detection in digital images.
Abstract: It is shown that frequency sensitive competitive learning (FSCL), one version of the recently improved competitive learning (CL) algorithms, significantly deteriorates in performance when the number of units is inappropriately selected. An algorithm called rival penalized competitive learning (RPCL) is proposed. In this algorithm, for each input, not only is the winner unit modified to adapt to the input, but also its rival (the second winner) is delearned with a smaller learning rate. RPCL can be regarded as an unsupervised extension of Kohonen's supervised LVQ2. RPCL has the ability to automatically allocate an appropriate number of units for an input data set. The experimental results show that RPCL outperforms FSCL when used for unsupervised classification, for training a radial basis function (RBF) network, and for curve detection in digital images.
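
The core RPCL update fits in a few lines. This sketch omits the frequency-sensitive ("conscience") weighting that FSCL and the full algorithm apply when selecting winner and rival, so treat it as a schematic of the learn/delearn idea only.

```python
import numpy as np

def rpcl_step(W, x, alpha_win=0.05, alpha_rival=0.002):
    """One rival penalized competitive learning step (schematic).

    The winner moves toward the input while the rival (second winner) is
    pushed away at a much smaller rate, driving superfluous units off the
    data so the effective number of units self-selects.
    """
    d = ((W - x) ** 2).sum(axis=1)
    win, rival = np.argsort(d)[:2]
    W[win] += alpha_win * (x - W[win])         # learn
    W[rival] -= alpha_rival * (x - W[rival])   # delearn the rival
    return W
```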

Journal ArticleDOI
TL;DR: A family of objective functions called fuzzy c-regression models, which can be used to fit switching regression models to certain types of mixed data, is presented; a general optimization approach is given and corresponding theoretical convergence results are discussed.
Abstract: A family of objective functions called fuzzy c-regression models, which can be used to fit switching regression models to certain types of mixed data, is presented. Minimization of particular objective functions in the family yields simultaneous estimates for the parameters of c regression models, together with a fuzzy c-partitioning of the data. A general optimization approach for the family of objective functions is given and corresponding theoretical convergence results are discussed. The approach is illustrated by two numerical examples that show how it can be used to fit mixed data to coupled linear and nonlinear models.
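
A hedged sketch of the linear case: alternate membership-weighted least squares for each of the c regression models with fuzzy-c-means-style membership updates computed from squared residuals. Initialization, the fuzzifier m, and the iteration count are illustrative choices.

```python
import numpy as np

def fuzzy_c_regression(X, y, c=2, m=2.0, n_iter=50, seed=0):
    """Fuzzy c-regression models sketch: c linear fits plus memberships."""
    rng = np.random.default_rng(seed)
    A = np.column_stack([X, np.ones(len(X))])      # design with intercept
    U = rng.dirichlet(np.ones(c), size=len(X))     # random fuzzy partition
    for _ in range(n_iter):
        coef = []
        for j in range(c):
            w = U[:, j] ** m                       # membership weights
            Aw = A * w[:, None]
            coef.append(np.linalg.lstsq(Aw.T @ A, Aw.T @ y, rcond=None)[0])
        resid2 = np.stack([(y - A @ b) ** 2 for b in coef], 1) + 1e-12
        U = resid2 ** (-1.0 / (m - 1.0))           # FCM-style update from
        U /= U.sum(1, keepdims=True)               # squared residuals
    return coef, U
```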

Journal ArticleDOI
TL;DR: A generalization of LVQ that updates all nodes for a given input vector is proposed; the terminal prototypes it produces are generally insensitive to initialization and independent of any choice of learning coefficient.
Abstract: The relationship between the sequential hard c-means (SHCM) and learning vector quantization (LVQ) clustering algorithms is discussed. The impact and interaction of these two families of methods with Kohonen's self-organizing feature mapping (SOFM), which is not a clustering method but often lends ideas to clustering algorithms, are considered. A generalization of LVQ that updates all nodes for a given input vector is proposed. The network attempts to find a minimum of a well-defined objective function. The learning rules depend on the degree of distance match to the winner node; the lesser the degree of match with the winner, the greater the impact on nonwinner nodes. Numerical results indicate that the terminal prototypes generated by this modification of LVQ are generally insensitive to initialization and independent of any choice of learning coefficient. The IRIS data of E. Anderson (1939) are used to illustrate the proposed method. Results are compared with the standard LVQ approach.

Journal ArticleDOI
TL;DR: In this article, the authors present tests of a code designed to simulate the evolution of self-gravitating fluids in 3D. The code is based on the smoothed-particle hydrodynamics (SPH) technique for solving the hydroynamical equations, together with a binary tree method for computing gravitational forces.
Abstract: We present tests of a code designed to simulate the evolution of self-gravitating fluids in three dimensions. The code is based on the smoothed-particle hydrodynamics (SPH) technique for solving the hydrodynamical equations, together with a binary tree method for computing gravitational forces. Our tests are relevant to the evolution of non-linear structure in hierarchically clustering universes. In particular, we study the collapse and merger of quasi-spherical, slowly rotating clumps containing a mixture of gas and collisionless dark matter. For self-similar spherical collapse the detailed solution structure is already known, but we also study more realistic situations for which this is not the case.

Journal ArticleDOI
TL;DR: A regionalization of the conterminous United States is accomplished using hierarchical cluster analysis on temperature and precipitation data; the best combination of clustering method and preprocessing strategy yields a set of candidate clustering levels, from which the 14-, 25-, and 8-cluster solutions are chosen.
Abstract: A regionalization of the conterminous United States is accomplished using hierarchical cluster analysis on temperature and precipitation data. The “best” combination of clustering method and data preprocessing strategy yields a set of candidate clustering levels, from which the 14-, 25-, and 8-cluster solutions are chosen. Collectively, these are termed the “reference clusterings.” At the 14-cluster level, the bulk of the nation is partitioned into four principal climate zones: the Southeast, East Central, Northeastern Tier, and Interior West clusters. Many small clusters are concentrated in the Pacific Northwest. The 25-cluster solution can be used to identify the subzones within the 14 clusters. At that more detailed level, many of the areally more extensive clusters are partitioned into smaller, more internally cohesive subgroups. The “best” clustering approach is the one that minimizes the influences of three forms of bias (methodological, latent, and information) for the dataset at hand. Source...
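
For the mechanics, a hedged SciPy sketch: Ward linkage on z-scored station climate vectors, cut at 14 clusters. The paper's point is precisely that the method and preprocessing combination matters, so this fixes one plausible combination rather than reproducing the study's "best" one.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

def climate_regions(stations, n_regions=14):
    """Hierarchical regionalization sketch.

    'stations' is an (n_stations, n_features) array of, e.g., monthly
    temperature and precipitation normals; each station gets a region label.
    """
    z = (stations - stations.mean(0)) / stations.std(0)    # z-score features
    Z = linkage(z, method="ward")                          # cluster tree
    return fcluster(Z, t=n_regions, criterion="maxclust")  # cut into regions
```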

Proceedings ArticleDOI
01 Jul 1993
TL;DR: A spectral approach to multiway ratio-cut partitioning is developed which provides a generalization of the ratio-cut cost metric to k-way partitioning and a lower bound on this cost metric.
Abstract: Recent research on partitioning has focussed on the ratio-cut cost metric which maintains a balance between the sizes of the edges cut and the sizes of the partitions without fixing the size of the partitions a priori. Iterative approaches and spectral approaches to two-way ratio-cut partitioning have yielded higher quality partitioning results. In this paper we develop a spectral approach to multiway ratio-cut partitioning which provides a generalization of the ratio-cut cost metric to k-way partitioning and a lower bound on this cost metric. Our approach involves finding the k smallest eigenvalue/eigenvector pairs of the Laplacian of the graph. The eigenvectors provide an embedding of the graph's n vertices into a k-dimensional subspace. We devise a time and space efficient clustering heuristic to coerce the points in the embedding into k partitions. Advancement over the current work is evidenced by the results of experiments on the standard benchmarks.
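
The embedding step is a few lines of dense linear algebra; the paper's time- and space-efficient clustering heuristic is replaced here by plain k-means (SciPy's kmeans2), so this conveys the spirit of the approach rather than its partitioning heuristic. A dense weight matrix W is assumed; real netlists would need sparse eigensolvers.

```python
import numpy as np
from scipy.cluster.vq import kmeans2

def multiway_ratio_cut(W, k):
    """Spectral k-way partitioning sketch.

    Embeds the n vertices using the eigenvectors of the graph Laplacian for
    the k smallest eigenvalues, then coerces the embedded points into k
    partitions with k-means.
    """
    L = np.diag(W.sum(axis=1)) - W          # graph Laplacian
    vals, vecs = np.linalg.eigh(L)          # eigenvalues in ascending order
    embed = vecs[:, :k]                     # n points in R^k
    _, labels = kmeans2(embed, k, minit="++")
    return labels
```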

Journal ArticleDOI
TL;DR: In this paper, the authors introduce two test procedures for the detection of multiple outliers that appear to be less sensitive to the observations they are supposed to identify, and compare them with various existing methods.
Abstract: We consider the problem of identifying and testing multiple outliers in linear models. The available outlier identification methods often do not succeed in detecting multiple outliers because they are affected by the observations they are supposed to identify. We introduce two test procedures for the detection of multiple outliers that appear to be less sensitive to this problem. Both procedures attempt to separate the data into a set of “clean” data points and a set of points that contain the potential outliers. The potential outliers are then tested to see how extreme they are relative to the clean subset, using an appropriately scaled version of the prediction error. The procedures are illustrated and compared to various existing methods, using several data sets known to contain multiple outliers. Also, the performances of both procedures are investigated by a Monte Carlo study. The data sets and the Monte Carlo indicate that both procedures are effective in the detection of multiple outliers ...
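
A schematic of the clean-subset idea (not either of the paper's two procedures): fit the model, keep the half of the data with the smallest residuals as a tentative clean set, refit on that set, and flag points whose prediction errors are large on the clean-set scale. The fraction and cutoff below are illustrative.

```python
import numpy as np

def flag_outliers(X, y, clean_frac=0.5, z_cut=2.5):
    """Clean-subset outlier screening sketch for a linear model."""
    A = np.column_stack([X, np.ones(len(X))])
    b0 = np.linalg.lstsq(A, y, rcond=None)[0]              # initial fit
    r0 = np.abs(y - A @ b0)
    clean = np.argsort(r0)[: int(clean_frac * len(y))]     # tentative clean set
    b = np.linalg.lstsq(A[clean], y[clean], rcond=None)[0]
    s = np.std(y[clean] - A[clean] @ b, ddof=A.shape[1])   # clean-set scale
    z = np.abs(y - A @ b) / s                              # scaled prediction error
    return np.where(z > z_cut)[0]
```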

Journal ArticleDOI
TL;DR: It is proved that every nonlinear clustering of a coarse grain DAG can be transformed into a linear clustering whose parallel time is less than or equal to that of the nonlinear one.
Abstract: The authors consider the impact of granularity on scheduling task graphs. Scheduling consists of two parts: the assignment of tasks to processors, also called clustering, and the ordering of tasks for execution in each processor. The authors introduce two types of clusterings: nonlinear and linear clusterings. A clustering is nonlinear if two parallel tasks are mapped in the same cluster; otherwise it is linear. Linear clustering fully exploits the natural parallelism of a given directed acyclic task graph (DAG), while nonlinear clustering sequentializes independent tasks to reduce parallelism. The authors also introduce a new quantification of the granularity of a DAG and define a coarse grain DAG as one whose granularity is greater than one. It is proved that every nonlinear clustering of a coarse grain DAG can be transformed into a linear clustering whose parallel time is less than or equal to that of the nonlinear one. This result is used to prove the optimality of some important linear clusterings used in parallel numerical computing.

Proceedings ArticleDOI
06 Oct 1993
TL;DR: Pablo is a performance analysis environment designed to provide unobtrusive performance data capture, analysis, and presentation across a wide variety of scalable parallel systems.
Abstract: Developers of application codes for massively parallel computer systems face daunting performance tuning and optimization problems that must be solved if massively parallel systems are to fulfill their promise. Recording and analyzing the dynamics of application program, system software, and hardware interactions is the key to understanding and the prerequisite to performance tuning, but this instrumentation and analysis must not unduly perturb program execution. Pablo is a performance analysis environment designed to provide unobtrusive performance data capture, analysis, and presentation across a wide variety of scalable parallel systems. Current efforts include dynamic statistical clustering to reduce the volume of data that must be captured and complete performance data immersion via head-mounted displays.

Journal ArticleDOI
TL;DR: In this article, country segmentation has been proposed to assist in marketing strategy decisions for international marketing managers, such schemes typically consist of grouping or clustering a set of specified co...
Abstract: Country segmentation has been proposed to assist in marketing strategy decisions for international marketing managers. Such schemes typically consist of grouping or clustering a set of specified co...

Book ChapterDOI
01 Jan 1993
TL;DR: A sharing scheme using a clustering methodology is introduced and compared with the classical sharing scheme; simulations on test functions and on a practical problem show that the proposed scheme proceeds faster than the classical scheme while performing as well.
Abstract: Genetic algorithms with sharing are well known for tackling multimodal function optimization problems. In this paper, a sharing scheme using a clustering methodology is introduced and compared with the classical sharing scheme. Simulations on test functions and on a practical problem show that the proposed scheme proceeds faster than the classical scheme while its performance remains as good. In addition, the proposed scheme reveals unknown multimodal function structure when a priori knowledge about the function is poor. Finally, introduction of a mating restriction inside the proposed scheme is investigated and shown to increase the optimization quality without requiring additional computational effort.

Journal ArticleDOI
TL;DR: A decomposition framework and a column generation scheme for solving a min-cut clustering problem are described; the subproblem that generates additional columns is itself an NP-hard mixed integer programming problem, for which some efficient solution strategies are presented.
Abstract: We describe a decomposition framework and a column generation scheme for solving a min-cut clustering problem. The subproblem to generate additional columns is itself an NP-hard mixed integer programming problem. We discuss strong valid inequalities for the subproblem and describe some efficient solution strategies. Computational results on compiler construction problems are reported.

Journal ArticleDOI
TL;DR: An automated approach is developed, based on self-organizing neural nets, to extract the key features of the molecular dynamics trajectory of a pentapeptide Tyr-Pro-Gly-Asp-Val that forms stable reverse turns in solution.
Abstract: The microscopic interactions and mechanisms leading to nascent protein folding events are generally unknown. While such short time-scale events are difficult to study experimentally, molecular dynamics simulations of peptides can provide a useful model for studying events related to protein folding initiation. Recently, two extremely long molecular dynamics simulations (2.2 ns each) were carried out on the pentapeptide Tyr-Pro-Gly-Asp-Val [Tobias, D. J., Mertz, J. E., & Brooks, C. L., III (1991) Biochemistry 30, 6054-6058] that forms stable reverse turns in solution. Tobias et al. examined folding events in this large system (approximately 30,000 conformations) using traditional methods of trajectory analysis. The sheer magnitude of this problem prompted us to develop an automated approach, based on self-organizing neural nets, to extract the key features of the molecular dynamics trajectory. The neural net is used to perform conformational clustering, which reduces the complexity of a system while minimizing the loss of information. The conformations were grouped together using distances in dihedral angle space as a measure of conformational similarity. The resulting clusters represent "conformational states", and transitions between these states were examined to identify mechanisms of conformational change. Many conformational changes involved the rotation of only a single dihedral angle, but concerted angle changes were also found. Most of the conformational information in the 30,000 samples from the full trajectories was retained in the relatively few resultant clusters, providing a powerful tool for analysis of an expanding base of large molecular simulations.


Book ChapterDOI
01 Jan 1993
TL;DR: It is demonstrated that an artificial neural network, Kohonen’s self-organizing feature map, used for visualisation and classification of high-dimensional data, can also serve knowledge acquisition and exploratory data analysis purposes.
Abstract: This paper presents the usage of an artificial neural network, Kohonen’s self-organizing feature map, for visualisation and classification of high-dimensional data. Through a learning process, this neural network creates a mapping from an N-dimensional space to a two-dimensional plane of units (neurons). This mapping is known to preserve topological relations of the N-dimensional space. A special technique, called the U-matrix method, has been developed in order to detect nonlinearities in the resulting mapping. This method can be used to visualize structures of the N-dimensional space. Boundaries between different subsets of the input data can be detected. This allows the method to be used for clustering the data. New data can be classified in an associative way. It has been demonstrated that the method can also be used for knowledge acquisition and exploratory data analysis purposes.
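
The U-matrix itself is straightforward to compute from a trained map: average the distance between each unit's weight vector and those of its grid neighbors, as sketched below (the (rows, cols, dim) weight layout is an assumption). High ridges in the result mark cluster boundaries on the two-dimensional map.

```python
import numpy as np

def u_matrix(weights):
    """U-matrix from SOM weights of shape (rows, cols, dim).

    Each cell is the mean distance from a unit's weight vector to its
    4-neighbors; large values indicate boundaries between clusters.
    """
    rows, cols, _ = weights.shape
    U = np.zeros((rows, cols))
    for i in range(rows):
        for j in range(cols):
            dists = [np.linalg.norm(weights[i, j] - weights[a, b])
                     for a, b in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1))
                     if 0 <= a < rows and 0 <= b < cols]
            U[i, j] = np.mean(dists)
    return U
```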

Proceedings ArticleDOI
01 Jul 1993
TL;DR: A zero-skew routing algorithm with clustering and improvement methods is proposed that achieves 20% reduction of the total wire length on benchmark data compared with the best known algorithm.
Abstract: A zero-skew routing algorithm with clustering and improvement methods is proposed. This algorithm generates a zero-skew routing in O(n log n) time for n pins, and it is proven that the order of the total wire length is best possible. Our algorithm achieves 20% reduction of the total wire length on benchmark data compared with the best known algorithm.

Journal ArticleDOI
Fazli Can
TL;DR: Through empirical testing it is shown that the algorithm achieves cost effectiveness and generates statistically valid clusters that are compatible with those of reclustering.
Abstract: Clustering of very large document databases is useful for both searching and browsing. The periodic updating of clusters is required due to the dynamic nature of databases. An algorithm for incremental clustering is introduced. The complexity and cost analysis of the algorithm together with an investigation of its expected behavior are presented. Through empirical testing it is shown that the algorithm achieves cost effectiveness and generates statistically valid clusters that are compatible with those of reclustering. The experimental evidence shows that the algorithm creates an effective and efficient retrieval environment.

Journal ArticleDOI
TL;DR: In comparison to existing algorithms in the literature that produce function-like models, the proposed fuzzy model, designed with the aid of fuzzy clustering, is of a relational character, allowing for multidirectional accessibility.

Journal ArticleDOI
Koji Okuda
TL;DR: In this paper, the phase model was used to analyze the stability of symmetric cluster states in globally coupled identical oscillators, where each pair of oscillators is assumed to interact through their phase difference.

Journal ArticleDOI
TL;DR: Recovery of the embedded clusters suggests that both flexible UPGMA and ALOC are significantly better than TWINSPAN.
Abstract: We compare three common types of clustering algorithms for use with community data. TWINSPAN is divisive and hierarchical, flexible UPGMA is agglomerative and hierarchical, and ALOC is non-hierarchical. A balanced-design six-factor model was used to generate 480 data sets of known characteristics. Recovery of the embedded clusters suggests that both flexible UPGMA and ALOC are significantly better than TWINSPAN. No significant difference existed between flexible UPGMA and ALOC.