
Showing papers on "Cluster analysis published in 1984"


Journal ArticleDOI
TL;DR: A FORTRAN-IV coding of the fuzzy c-means (FCM) clustering program is presented; it generates fuzzy partitions and prototypes for any set of numerical data.
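
For illustration only, the following is a minimal NumPy sketch of the membership and prototype updates such a program iterates; it is not the paper's FORTRAN-IV code, and the fuzzifier m = 2, tolerance, and random initialization are illustrative choices.

import numpy as np

def fcm(X, c, m=2.0, n_iter=100, tol=1e-5, seed=0):
    """Return fuzzy memberships U (n x c) and prototypes V (c x d) for data X (n x d)."""
    rng = np.random.default_rng(seed)
    U = rng.random((X.shape[0], c))
    U /= U.sum(axis=1, keepdims=True)                      # membership rows sum to 1
    for _ in range(n_iter):
        Um = U ** m
        V = (Um.T @ X) / Um.sum(axis=0)[:, None]           # prototype update
        d = np.linalg.norm(X[:, None, :] - V[None, :, :], axis=2) + 1e-12
        U_new = 1.0 / (d ** (2.0 / (m - 1.0)))             # membership update
        U_new /= U_new.sum(axis=1, keepdims=True)
        if np.abs(U_new - U).max() < tol:
            return U_new, V
        U = U_new
    return U, V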

5,287 citations


Journal ArticleDOI
TL;DR: It is shown that under certain conditions the K-means algorithm may fail to converge to a local minimum, and that it converges under differentiability conditions to a Kuhn-Tucker point.
Abstract: The K-means algorithm is a commonly used technique in cluster analysis. In this paper, several questions about the algorithm are addressed. The clustering problem is first cast as a nonconvex mathematical program. Then, a rigorous proof of the finite convergence of the K-means-type algorithm is given for any metric. It is shown that under certain conditions the algorithm may fail to converge to a local minimum, and that it converges under differentiability conditions to a Kuhn-Tucker point. Finally, a method for obtaining a local-minimum solution is given.
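
For reference, a minimal sketch of the two-step K-means iteration the paper analyzes; it is the monotone decrease of the within-cluster sum of squares under these steps that yields finite termination (though not necessarily at a local minimum, as the paper shows). The initialization and stopping rule below are illustrative, not the paper's.

import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    """Basic K-means: returns cluster labels and centers for data X (n x d)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]
    for _ in range(n_iter):
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)                          # assignment step
        new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                        else centers[j] for j in range(k)])  # centroid step
        if np.allclose(new, centers):
            break
        centers = new
    return labels, centers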

1,180 citations


Journal ArticleDOI
TL;DR: A centroid SAHN clustering algorithm that requires O(n²) time, in the worst case, for fixed k and for a family of dissimilarity measures including the Manhattan, Euclidean, Chebychev and all other Minkowski metrics is described.
Abstract: Whenever n objects are characterized by a matrix of pairwise dissimilarities, they may be clustered by any of a number of sequential, agglomerative, hierarchical, nonoverlapping (SAHN) clustering methods. These SAHN clustering methods are defined by a paradigmatic algorithm that usually requires O(n³) time, in the worst case, to cluster the objects. An improved algorithm (Anderberg 1973), while still requiring O(n³) worst-case time, can reasonably be expected to exhibit O(n²) expected behavior. By contrast, we describe a SAHN clustering algorithm that requires O(n² log n) time in the worst case. When SAHN clustering methods exhibit reasonable space distortion properties, further improvements are possible. We adapt a SAHN clustering algorithm, based on the efficient construction of nearest neighbor chains, to obtain a reasonably general SAHN clustering algorithm that requires in the worst case O(n²) time and space. Whenever n objects are characterized by k-tuples of real numbers, they may be clustered by any of a family of centroid SAHN clustering methods. These methods are based on a geometric model in which clusters are represented by points in k-dimensional real space and points being agglomerated are replaced by a single (centroid) point. For this model, we have solved a class of special packing problems involving point-symmetric convex objects and have exploited it to design an efficient centroid clustering algorithm. Specifically, we describe a centroid SAHN clustering algorithm that requires O(n²) time, in the worst case, for fixed k and for a family of dissimilarity measures including the Manhattan, Euclidean, Chebychev and all other Minkowski metrics.
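
The nearest-neighbor-chain construction mentioned above can be sketched as follows for group-average linkage, a method that satisfies the reducibility property the chain argument relies on. This is a standard textbook-style illustration using the Lance-Williams update, not the paper's implementation, and it does not attempt the packing-based centroid result.

import numpy as np

def nn_chain_average(D):
    """D: symmetric (n x n) dissimilarity matrix. Returns merges as (i, j, height),
    where cluster j is absorbed into cluster i (group-average linkage)."""
    D = np.array(D, dtype=float)
    n = len(D)
    np.fill_diagonal(D, np.inf)
    size = np.ones(n)
    active = set(range(n))
    merges, chain = [], []
    while len(active) > 1:
        if not chain:
            chain.append(min(active))
        while True:
            a = chain[-1]
            # nearest active neighbor of a, preferring the previous chain element on ties
            b = min((j for j in active if j != a), key=lambda j: D[a, j])
            if len(chain) > 1 and D[a, chain[-2]] <= D[a, b]:
                b = chain[-2]                       # reciprocal nearest neighbors found
                break
            chain.append(b)
        a = chain.pop()
        chain.pop()                                 # drop b from the chain as well
        merges.append((a, b, D[a, b]))
        for k in active - {a, b}:                   # Lance-Williams group-average update
            D[a, k] = D[k, a] = (size[a] * D[a, k] + size[b] * D[b, k]) / (size[a] + size[b])
        size[a] += size[b]
        active.discard(b)
    return merges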

877 citations


Journal ArticleDOI
TL;DR: An algorithm for constructing models on the basis of fuzzy and nonfuzzy data with the aid of fuzzy discretization and clustering techniques is proposed.

524 citations


Journal ArticleDOI
TL;DR: A simple formula to predict the distance traveled by fleets of vehicles in physical distribution problems involving a depot and its area of influence is developed.
Abstract: The purpose of this paper is to develop a simple formula to predict the distance traveled by fleets of vehicles in physical distribution problems involving a depot and its area of influence. Since the transportation cost of operating a break-bulk terminal or a warehouse is intimately related to the distance traveled, the availability of such a simple formula should facilitate the study of more complex logistics problems. A simple manual dispatching strategy, intended to mimic what dispatchers do but simple enough to admit analytical modeling, is presented. Since the formulas agree rather well with the length of nearly optimal computer-built tours, the predictions should approximate distances achievable in practice; the formulas seem realistic. The technique is a variant of the classical “cluster-first, route-second” approach to vehicle routing problems. In these approaches, the depot influence area is first partitioned into districts containing clusters of stops; one vehicle route is then constructed to serve each cluster. Our procedure is characterized by the way district shapes are chosen; ignoring shape during the clustering step can significantly increase travel distances. The technique is simple. To exercise it, one needs only a pencil, an eraser, and a scale map showing the destinations. Once mastered, the technique takes only a few minutes, and this time should increase only linearly with the number of destinations. For repetitive problems, the technique can be enhanced with the help of interactive computer graphics. A newspaper delivery problem for the city of San Francisco is used as an illustration.
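
A bare-bones sketch of the generic "cluster-first, route-second" strategy the paper builds on: stops are districted with plain k-means and each district is routed by a nearest-neighbor tour from the depot. Everything here is illustrative; the paper's actual contribution, the shape-aware districting rule and the closed-form distance formula, is not reproduced.

import numpy as np

def cluster_first_route_second(stops, depot, n_vehicles, seed=0):
    """stops: (n x 2) coordinates, depot: (2,) coordinates. Returns one stop order per vehicle."""
    rng = np.random.default_rng(seed)
    centers = stops[rng.choice(len(stops), n_vehicles, replace=False)]
    for _ in range(50):                                   # cluster first: plain k-means districting
        labels = np.linalg.norm(stops[:, None] - centers[None], axis=2).argmin(axis=1)
        centers = np.array([stops[labels == j].mean(axis=0) if np.any(labels == j)
                            else centers[j] for j in range(n_vehicles)])
    routes = []
    for j in range(n_vehicles):                           # route second: greedy nearest-neighbor tour
        todo = list(np.flatnonzero(labels == j))
        pos, route = depot, []
        while todo:
            nxt = min(todo, key=lambda i: np.linalg.norm(stops[i] - pos))
            route.append(nxt)
            pos = stops[nxt]
            todo.remove(nxt)
        routes.append(route)
    return routes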

258 citations


Journal ArticleDOI
TL;DR: Empirical results using both a primal heuristic and a hybrid heuristic-subgradient method for problems having n ⩽ 100 show that the algorithms locate solutions close to optimal without resorting to tree enumeration.

170 citations


Journal ArticleDOI
TL;DR: In this article, the authors developed a new theory for galaxy clustering in an expanding universe based on the thermodynamics of gravitating systems and applied to the highly nonlinear regime of strong clustering.
Abstract: We develop a new theory for galaxy clustering in an expanding universe. It is based on the thermodynamics of gravitating systems and applies to the highly nonlinear regime of strong clustering. There are no free parameters in the simplest form of this theory. It predicts distribution functions of all orders, from voids to hundreds of galaxies. Comparison of these predictions with the results of numerical N-body experiments shows substantial agreement. Comparison with the observed distribution of galaxies may determine whether it has unrelaxed structure that retains information from much earlier epochs of the universe.

152 citations


Posted Content
TL;DR: A new method (SYNCLUS, SYNthesized CLUStering) is proposed for dealing with the problem of how the various contributory variables in a specific battery can be weighted so as to enhance any cluster structure that may be present.
Abstract: In the application of clustering methods to real world data sets, two problems frequently arise: (a) how can the various contributory variables in a specific battery be weighted so as to enhance some cluster structure that may be present, and (b) how can various alternative batteries be combined to produce a single clustering that "best" incorporates each contributory set. A new method is proposed (SYNCLUS, SYNthesized CLUStering) for dealing with these two problems.

146 citations


Journal ArticleDOI
TL;DR: The method is applied to the two-point angular correlation function of the 14-mag Zwicky galaxy catalog; assuming the power-law form ω(θ) = (θ0/θ)^γ, the authors find mean values γ = 0.80 and θ0 = 0.06 radians with standard errors σ(θ0) = 0.01 and σ(γ) = 0.13.
Abstract: The method is applied to the two-point angular correlation function of the 14-mag Zwicky galaxy catalog. If the two-point function has the form ω(θ) = (θ0/θ)^γ, standard errors of σ(θ0) = 0.01 and σ(γ) = 0.13 are found, with mean values of γ = 0.80 and θ0 = 0.06 radians.
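
A small illustrative fit of the assumed power-law form on log-log axes; the synthetic data below merely stand in for the catalog estimates, and the recovered parameters should come out near the quoted means.

import numpy as np

def fit_power_law(theta, w):
    """Return (theta0, gamma) for w ~= (theta0 / theta) ** gamma via log-log least squares."""
    b, a = np.polyfit(np.log(theta), np.log(w), 1)   # log w = a + b * log(theta), with gamma = -b
    gamma = -b
    theta0 = np.exp(a / gamma)
    return theta0, gamma

theta = np.geomspace(0.01, 0.5, 20)                  # angles in radians (synthetic)
w = (0.06 / theta) ** 0.80 * np.exp(np.random.default_rng(0).normal(0, 0.05, 20))
print(fit_power_law(theta, w))                       # roughly (0.06, 0.80)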

141 citations


Journal ArticleDOI
TL;DR: In this paper, a scaling theory for aggregation by kinetic clustering of clusters is developed, yielding a global picture of static and dynamic critical properties in which the dynamic critical exponent can be related to the fractal dimension.
Abstract: A scaling theory is developed for aggregation by means of kinetic clustering of clusters. A global picture of static and dynamic critical properties emerges, whereby the dynamic critical exponent can be related to the fractal dimension. Furthermore, the growth process is described in terms of a purely kinetic model. The scaling predictions agree well with numerical results.

138 citations


Journal ArticleDOI
TL;DR: In this paper, the authors propose a new method (SYNCLUS, SYNthesized CLUStering) for dealing with two problems: (a) how the various contributory variables in a specific battery can be weighted so as to enhance some cluster structure that may be present, and (b) how various alternative batteries can be combined to produce a single clustering that best incorporates each contributory set.
Abstract: In the application of clustering methods to real world data sets, two problems frequently arise: (a) how can the various contributory variables in a specific battery be weighted so as to enhance some cluster structure that may be present, and (b) how can various alternative batteries be combined to produce a single clustering that “best” incorporates each contributory set. A new method is proposed (SYNCLUS, SYNthesized CLUStering) for dealing with these two problems.

Journal ArticleDOI
TL;DR: This work presents an approach for picture indexing and abstraction, and illustrates by examples how to apply abstraction operations to obtain various picture indexes, and how to construct icons to facilitate accessing of pictorial data.
Abstract: We present an approach for picture indexing and abstraction. Picture indexing facilitates information retrieval from a pictorial database consisting of picture objects and picture relations. To construct picture indexes, abstraction operations to perform picture object clustering and classification are formulated. To substantiate the abstraction operations, we also formalize syntactic abstraction rules and semantic abstraction rules. We then illustrate by examples how to apply these abstraction operations to obtain various picture indexes, and how to construct icons to facilitate accessing of pictorial data.

Journal ArticleDOI
TL;DR: In this article, the authors focus on methods that are based on a random sample of points and that use a combination of clustering and local search to identify all the local optima that are potentially global.
Abstract: SYNOPTIC ABSTRACT: The most efficient methods for finding the global minimum of an objective function (not necessarily convex) are those that embody stochastic elements. In this survey, we focus on methods that are based on a random sample of points and that use a combination of clustering and local search to identify all the local optima that are potentially global. Special attention is paid to a proper termination criterion for the sequence of sampling, clustering and searching, and to the analysis of the result produced by the method.

01 Jan 1984
TL;DR: In this paper, a new method for determining the number of groups in a numerical classification is proposed, based on the average similarity of an individual with the members of its group, including itself.
Abstract: A new method for determining the number of groups in a numerical classification is proposed. Extensive tests of the criterion for the "correct" or "optimum" number of groups are reported. The criterion may be used with any definition of similarity whose possible values are bounded by zero and unity, and with any agglomerative clustering method, whether it be hierarchical or nonhierarchical. It may also be used in conjunction with divisive clustering methods for which the similarity coefficients can conveniently be obtained. The procedure is based on the average similarity of an individual with the members of its group, including itself, and readily lends itself to interactive computation if one wishes to find the partition that maximizes the overall average similarity for a given number of groups. In that sense, the procedure may also be considered to be a clustering method.
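
The quantity the criterion is built on can be sketched directly: each object's average similarity to the members of its own group, itself included, averaged over all objects. The paper's rule for turning this score into the "correct" number of groups is not reproduced here; the function below only evaluates one candidate partition.

import numpy as np

def overall_average_similarity(S, labels):
    """S: (n x n) similarity matrix with values in [0, 1]; labels: group label of each object."""
    S = np.asarray(S, dtype=float)
    labels = np.asarray(labels)
    # average similarity of object i to its own group, including itself
    per_object = np.array([S[i, labels == labels[i]].mean() for i in range(len(S))])
    return per_object.mean()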

Journal ArticleDOI
TL;DR: A cluster-analytic approach to benefit segmentation is described using an illustrative empirical example drawn from the Austrian domestic travel market, demonstrating that some benefits are incompatible with each other at the segment level, and that the "average" vacationer may be only a statistical artifact.
Abstract: A cluster-analytic approach to benefit segmentation is described using an illustrative empirical example drawn from the Austrian domestic travel market. The results demonstrate that some benefits are incompatible with each other at the segment level, and that the "average" vacationer may be only a statistical artifact.

Journal ArticleDOI
TL;DR: New approaches to unsupervised fuzzy classification of multidimensional data using ‘semi-fuzzy’ or ‘soft’ clustering techniques are discussed.

Journal ArticleDOI
TL;DR: A new index for the level of disease clustering in time is proposed, devised for the case where the data are grouped into several equally spaced intervals and applicable to both temporal and cyclical clustering.
Abstract: This paper presents a new index for the level of disease clustering in time, devised for the case where the data are grouped into several equally spaced intervals. This index is applicable to both temporal and cyclical clustering. The asymptotic distribution of this index is derived under the null hypothesis of no clustering in time. Monte Carlo simulation studies show that the asymptotic results are good approximations even when the sample size is as small as the number of intervals, an average of one per interval. The powers of the test based on this index for both types of clustering are compared with those of several existing procedures. Tables of upper percentage points of this index are given.

Journal ArticleDOI
TL;DR: Several techniques are given for the uniform generation of trees for use in Monte Carlo studies of clustering and tree representations, and general strategies are reviewed for random selection from a set of combinatorial objects.
Abstract: Several techniques are given for the uniform generation of trees for use in Monte Carlo studies of clustering and tree representations. First, general strategies are reviewed for random selection from a set of combinatorial objects with special emphasis on two that use random mapping operations. Theorems are given on how the number of such objects in the set (e.g., whether the number is prime) affects which strategies can be used. Based on these results, methods are presented for the random generation of six types of binary unordered trees. Three types of labeling and both rooted and unrooted forms are considered. Presentation of each method includes the theory of the method, the generation algorithm, an analysis of its computational complexity and comments on the distribution of trees over which it samples. Formal proofs and detailed algorithms are in appendices.
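
As one concrete instance of the kind of generation the paper surveys, leaf-labeled unrooted binary trees can be drawn uniformly at random by attaching each new leaf to an edge chosen uniformly, which matches the (2n-5)!! count of such trees. This is a well-known construction offered only for illustration; the paper treats several tree types and strategies in far more generality.

import random

def random_unrooted_binary_tree(n_leaves, seed=None):
    """Return an edge list; leaves are labeled 0..n_leaves-1, internal nodes follow."""
    rng = random.Random(seed)
    edges = [(0, 1)]                                   # start from the two-leaf tree
    next_internal = n_leaves
    for leaf in range(2, n_leaves):
        u, v = edges.pop(rng.randrange(len(edges)))    # pick an edge uniformly and subdivide it
        w = next_internal
        next_internal += 1
        edges += [(u, w), (w, v), (w, leaf)]           # hang the new leaf off the new node
    return edges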

Journal ArticleDOI
TL;DR: The CONCLUS model and algorithm are described in detail, as well as their flexibility for use in various applications, and Monte Carlo results are presented for two synthetic data sets with appropriate discussion of the resulting implications.
Abstract: In many classification problems, one often possesses external and/or internal information concerning the objects or units to be analyzed which makes it appropriate to impose constraints on the set of allowable classifications and their characteristics. CONCLUS, or CONstrained CLUStering, is a new methodology devised to perform constrained classification in either an overlapping or nonoverlapping (hierarchical or nonhierarchial) manner. This paper initially reviews the related classification literature. A discussion of the use of constraints in clustering problems is then presented. The CONCLUS model and algorithm are described in detail, as well as their flexibility for use in various applications. Monte Carlo results are presented for two synthetic data sets with appropriate discussion of the resulting implications. An illustration of CONCLUS is presented with respect to a sales territory design problem where the objects classified are various Forbes-500 companies. Finally, the discussion section highlights the main contribution of the paper and offers some areas for future research.

Journal ArticleDOI
TL;DR: This work gives an overview of cluster analysis and pattern recognition methods and performs hierarchical clustering on a small data set to reduce the feature set to an orthogonal one.

Journal ArticleDOI
TL;DR: The DPP program is equipped with a leading verb command language for input and job scheduling, thus providing an efficient and user-friendly operator/program interface, and with a data-base organization that accommodates a wide variety of data structures.

Journal ArticleDOI
TL;DR: A nonparametric data reduction technique is proposed that is iterative and based on the use of a criterion function and nearest neighbor density estimates to select samples that are "representative" of the entire data set.
Abstract: A nonparametric data reduction technique is proposed. Its goal is to select samples that are "representative" of the entire data set. The technique is iterative and is based on the use of a criterion function and nearest neighbor density estimates. Experiments are presented to demonstrate the algorithm.
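
A loose sketch of the idea only: score samples with a k-nearest-neighbor density estimate and greedily keep high-density samples that are not too close to representatives already chosen. The paper's criterion function and iteration differ in detail; the parameters and the particular score below are illustrative assumptions.

import numpy as np

def select_representatives(X, n_keep, k=5):
    """Pick n_keep representative rows of X using kNN density estimates (illustrative)."""
    D = np.linalg.norm(X[:, None] - X[None, :], axis=2)
    kth = np.sort(D, axis=1)[:, k]                     # distance to the k-th nearest neighbor
    density = 1.0 / (kth + 1e-12)                      # simple kNN density proxy
    chosen = []
    for _ in range(n_keep):
        if chosen:
            nearest_rep = D[:, chosen].min(axis=1)     # penalize closeness to chosen points
            scores = density * nearest_rep
        else:
            scores = density.copy()
        scores[chosen] = -np.inf                       # never re-pick a representative
        chosen.append(int(scores.argmax()))
    return X[chosen], chosen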

Posted Content
TL;DR: A new methodology called INDTREES (for INdividual Differences in TREE Structures) is developed for fitting various (discrete) tree structures to three-way proximity data, illustrated on intention-to-buy data for over-the-counter pain relievers addressing three common types of maladies.
Abstract: Models for the representation of proximity data (similarities/dissimilarities) can be categorized into one of three groups of models: continuous spatial models, discrete nonspatial models, and hybrid models (which combine aspects of both spatial and discrete models). Multidimensional scaling models and associated methods, used for the spatial representation of such proximity data, have been devised to accommodate two, three, and higher-way arrays. At least one model/method for overlapping (but generally non-hierarchical) clustering called INDCLUS (Carroll and Arabie 1983) has been devised for the case of three-way arrays of proximity data. Tree-fitting methods, used for the discrete network representation of such proximity data, have only thus far been devised to handle two-way arrays. This paper develops a new methodology called INDTREES (for INdividual Differences in TREE Structures) for fitting various (discrete) tree structures to three-way proximity data. This individual differences generalization is one in which different individuals, for example, are assumed to base their judgments on the same family of trees, but are allowed to have different node heights and/or branch lengths. We initially present an introductory overview focussing on existing two-way models. The INDTREES model and algorithm are then described in detail. Monte Carlo results for the INDTREES fitting of four different three-way data sets are presented. In the application, a single ultrametric tree is fitted to three-way proximity data derived from intention-to-buy data for various brands of over-the-counter pain relievers for relieving three common types of maladies. Finally, we briefly describe how the INDTREES procedure can be extended to accommodate hybrid modelling, as well as to handle other types of applications.

Posted Content
TL;DR: This work introduces a recently developed clustering method that appears to be more suited to the analysis of such nonsymmetric data, and describes an application and comparison of the various approaches.
Abstract: Rao and Sabavala (1981) recently proposed a hierarchical clustering methodology applied to normalized brand switching matrices to assess competitive market structure. We introduce a recently developed clustering method that appears to be more suited to the analysis of such nonsymmetric data, and describe an application and comparison of the various approaches.

Journal ArticleDOI
TL;DR: Three techniques for disease time-space clustering analysis, those of Knox, Mantel and Ederer-Myers-Mantel, were applied to simulated data to study their sensitivities; the results indicate that the three techniques may not be sufficiently sensitive to detect clustering in a real data set of HD cases.
Abstract: Three techniques for disease time-space clustering analysis, those of Knox, Mantel and Ederer-Myers-Mantel, were applied to simulated data so as to study their sensitivities. The simulated data corresponded to three alternative non-null models for the distribution, transmission and development of Hodgkin's disease (HD) which were formulated in accordance with the results of published studies. The results indicate that the three techniques may not be sufficiently sensitive to the clustering in a real data set of HD cases. Therefore, the inconclusive results obtained to date with regard to clustering of HD may be related to the low power of the statistical techniques employed.

Journal ArticleDOI
TL;DR: In this paper, an individual differences generalization is proposed for fitting various discrete tree structures to three-way proximity data, in which different individuals are assumed to base their judgments on the same family of trees, but are allowed to have different node heights and/or branch lengths.
Abstract: Models for the representation of proximity data (similarities/dissimilarities) can be categorized into one of three groups of models: continuous spatial models, discrete nonspatial models, and hybrid models (which combine aspects of both spatial and discrete models). Multidimensional scaling models and associated methods, used for the spatial representation of such proximity data, have been devised to accommodate two, three, and higher-way arrays. At least one model/method for overlapping (but generally non-hierarchical) clustering called INDCLUS (Carroll and Arabie 1983) has been devised for the case of three-way arrays of proximity data. Tree-fitting methods, used for the discrete network representation of such proximity data, have only thus far been devised to handle two-way arrays. This paper develops a new methodology called INDTREES (for INdividual Differences in TREE Structures) for fitting various (discrete) tree structures to three-way proximity data. This individual differences generalization is one in which different individuals, for example, are assumed to base their judgments on the same family of trees, but are allowed to have different node heights and/or branch lengths.

Journal ArticleDOI
TL;DR: In this paper, a recently developed clustering method that appears better suited to the analysis of nonsymmetric brand-switching data is introduced and compared, through an application, with the hierarchical approach of Rao and Sabavala (1981).
Abstract: Rao and Sabavala (1981) recently proposed a hierarchical clustering methodology applied to normalized brand switching matrices to assess competitive market structure. We introduce a recently developed clustering method that appears to be more suited to the analysis of such nonsymmetric data, and describe an application and comparison of the various approaches.


Journal ArticleDOI
TL;DR: A new clustering algorithm based on a K-means approach that requires no user parameter specification is presented; it performs as well as or better than the previously used clustering techniques when tested as part of a speaker-independent isolated word recognition system.
Abstract: Recent studies of isolated word recognition systems have shown that a set of carefully chosen templates can be used to bring the performance of speaker‐independent systems up to that of systems trained to the individual speaker. The earliest work in this area used a sophisticated set of pattern recognition algorithms in a human‐interactive mode to create the set of templates (multiple patterns) for each word in the vocabulary. Not only was this procedure time consuming but it was impossible to reproduce exactly, because it was highly dependent on decisions made by the experimenter. Subsequent work led to an automatic clustering procedure which, given only a set of clustering parameters, clustered tokens with the same performance as the previously developed supervised algorithms. The one drawback of the automatic procedure was that the specification of the input parameter set was found to be somewhat dependent on the vocabulary type and size of population to be clustered. Since the user of such a statistical clustering algorithm could not be expected, in general, to know how to choose the word clustering parameters, even this automatic clustering algorithm was not appropriate for a completely general word recognition system. It is the purpose of this paper to present a new clustering algorithm based on a K‐means approach which requires no user parameter specification. Experimental data show that this new algorithm performs as well or better than the previously used clustering techniques when tested as part of a speaker independent isolated word recognition system.

Proceedings ArticleDOI
F. Montel, P.L. Gouel
TL;DR: A new algorithm to lump fluid compounds into hypothetical components has been designed to bridge the gap between the increasing amount of analytical data provided by modern lab equipment and the simplified fluid description required in compositional model studies.
Abstract: This paper presents a new algorithm to lump fluid compounds into hypothetical components. It has been designed to bridge the gap between the increasing amount of analytical data provided by modern lab equipment and the simplified fluid description required in compositional model studies. The main feature of the method is a lumping scheme based on the similarities of a few properties of all the compounds identified by chromatographic analysis. An iterative clustering algorithm around mobile centers yields a classification into pseudo-components that is optimal with respect to the equation of state considered. It can be adapted to any equation of state and any number of compounds or properties. Another notable feature of this procedure is the definition of a mixed calculation of critical properties: a criterion is suggested for choosing between an accurate calculation of true critical properties and the classical mixing rules. Numerical simulations of PVT behavior were completed for an oil and a gas condensate defined by 150 compounds. The clustering algorithm produced simplified fluids lumped into 7 and 5 components, respectively. True critical property calculation for the hypothetical components and classical mixing rules were both tested, and the mixed calculation proved to be the best.
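
A much-simplified sketch of lumping by clustering around mobile centers: compounds are grouped by k-means on a few standardized properties, and the lumped properties are mole-fraction-weighted averages. The property selection, weighting, and equation-of-state coupling used in the paper are not reproduced, and every pseudo-component is assumed to receive at least one compound.

import numpy as np

def lump_compounds(props, mole_frac, n_pseudo, n_iter=100, seed=0):
    """props: (n_compounds x n_properties), mole_frac: (n_compounds,).
    Returns cluster labels and mole-fraction-weighted pseudo-component properties."""
    Z = (props - props.mean(axis=0)) / props.std(axis=0)        # standardize the properties
    rng = np.random.default_rng(seed)
    centers = Z[rng.choice(len(Z), n_pseudo, replace=False)]
    for _ in range(n_iter):                                     # k-means around mobile centers
        labels = np.linalg.norm(Z[:, None] - centers[None], axis=2).argmin(axis=1)
        new = np.array([Z[labels == j].mean(axis=0) if np.any(labels == j)
                        else centers[j] for j in range(n_pseudo)])
        if np.allclose(new, centers):
            break
        centers = new
    pseudo = np.array([np.average(props[labels == j], axis=0,
                                  weights=mole_frac[labels == j])
                       for j in range(n_pseudo)])                # simple mole-fraction mixing
    return labels, pseudo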