Showing papers on "Cluster analysis" published in 1969

Open Access

Journal Article•DOI•

A Nonlinear Mapping for Data Structure Analysis

01 May 1969-IEEE Transactions on Computers

TL;DR: An algorithm for the analysis of multivariate data is presented along with some experimental results; it is based upon a point mapping of N L-dimensional vectors from the L-space to a lower-dimensional space such that the inherent data "structure" is approximately preserved.

Abstract: An algorithm for the analysis of multivariate data is presented along with some experimental results. The algorithm is based upon a point mapping of N L-dimensional vectors from the L-space to a lower-dimensional space such that the inherent data "structure" is approximately preserved.
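
The abstract does not state how "approximately preserved" is measured. As a minimal sketch, the code below assumes the distance-weighted stress commonly associated with this kind of nonlinear mapping: each squared mismatch between an original distance and its mapped counterpart is divided by the original distance, and the sum is normalized by the total of the original distances. The function names and the sample points are hypothetical, and the optimization that actually moves the mapped points is left out.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Plain Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mapping_stress(high, low):
    """Distance-weighted mismatch between original points and their images.

    Assumed form: (1 / sum d*_ij) * sum (d*_ij - d_ij)^2 / d*_ij over i < j,
    where d* is the distance in L-space and d the distance in the map.
    """
    numerator = 0.0
    total = 0.0
    for i, j in combinations(range(len(high)), 2):
        d_star = euclidean(high[i], high[j])
        d = euclidean(low[i], low[j])
        if d_star == 0.0:
            continue  # coincident originals are skipped to avoid dividing by zero
        numerator += (d_star - d) ** 2 / d_star
        total += d_star
    return numerator / total if total > 0.0 else 0.0


# Hypothetical data: three points that already lie in a plane of 3-space.
high = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
exact_map = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]      # keeps every distance
collapsed_map = [(0.0, 0.0), (0.0, 0.0), (0.0, 0.0)]  # loses every distance

print(mapping_stress(high, exact_map))
print(mapping_stress(high, collapsed_map))
```

A mapping that keeps all interpoint distances scores zero, and the score grows as the low-dimensional picture distorts the original distances.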

3,460 citations

Journal Article•DOI•

A new approach to clustering

Enrique H. Ruspini¹•Institutions (1)

University of California, Los Angeles¹

01 Jul 1969-Information & Computation

TL;DR: A new method of representation of the reduced data, based on the idea of “fuzzy sets,” is proposed to avoid some of the problems of current clustering procedures and to provide better insight into the structure of the original data.

Abstract: A general formulation of data reduction and clustering processes is proposed. These procedures are regarded as mappings or transformations of the original space onto a “representation” or “code” space subjected to some constraints. Current clustering methods, as well as three other data reduction techniques, are specified within the framework of this formulation. A new method of representation of the reduced data, based on the idea of “fuzzy sets,” is proposed to avoid some of the problems of current clustering procedures and to provide better insight into the structure of the original data.
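
To make the "fuzzy sets" idea concrete, here is a small sketch in which a point receives a degree of membership in every cluster instead of a single label. The inverse-squared-distance rule and the name `fuzzy_memberships` are assumptions chosen for illustration; the abstract does not say how the paper computes its memberships, so this should not be read as the author's method.

```python
def fuzzy_memberships(point, centers):
    """Degrees of membership of one point in each cluster, summing to 1.

    Assumed rule (illustrative only): membership is proportional to the
    inverse squared distance from the point to each cluster center.
    """
    squared = [sum((p - c) ** 2 for p, c in zip(point, center)) for center in centers]
    on_center = [i for i, s in enumerate(squared) if s == 0.0]
    if on_center:
        # A point sitting exactly on one or more centers belongs fully to them.
        return [1.0 / len(on_center) if i in on_center else 0.0
                for i in range(len(centers))]
    inverse = [1.0 / s for s in squared]
    total = sum(inverse)
    return [value / total for value in inverse]


# Hypothetical centers for two clusters.
centers = [(0.0, 0.0), (4.0, 0.0)]
near_first = fuzzy_memberships((1.0, 0.0), centers)
in_between = fuzzy_memberships((2.0, 0.0), centers)
print(near_first, in_between)
```

A point halfway between two clusters gets equal membership in both, which a hard partition cannot express.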

1,452 citations

Journal Article•DOI•

256 NOTE: An Algorithm for Hierarchical Classifications

David Wishart

01 Mar 1969-Biometrics

TL;DR: An algorithm is given which can be programmed to compute a general classification process for six accepted methods, and results are derived for the method of optimizing an error sum of squares objective function.

Abstract: SUMMARY The recent interest in numerical classification and its application in the biological sciences to the evaluation of taxa has prompted the introduction of a large number of clustering processes, many of which are justified by empirical results. There is clearly a need for the formulation of a theoretical approach to the subject, and this is aided by the comparison and generalisation of existing processes by analytic methods. The work of Lance and Williams in this direction is supplemented here by results derived for the method of optimizing an error sum of squares objective function, and an algorithm is given which can be programmed to compute a general classification process for six accepted methods.
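
The generalisation attributed to Lance and Williams is usually written as a single update rule: after clusters i and j are merged, the dissimilarity from any other cluster k to the merged cluster is a weighted combination of the three old dissimilarities. The sketch below assumes that standard form and shows two coefficient sets as examples, nearest neighbour (single linkage) and the error-sum-of-squares method with squared Euclidean dissimilarities. The coefficient values are the ones commonly quoted in the literature, not taken from the abstract, and the function names are hypothetical.

```python
def lance_williams(d_ki, d_kj, d_ij, alpha_i, alpha_j, beta, gamma):
    """Dissimilarity between cluster k and the merger of clusters i and j."""
    return alpha_i * d_ki + alpha_j * d_kj + beta * d_ij + gamma * abs(d_ki - d_kj)


def single_linkage_update(d_ki, d_kj, d_ij):
    # These coefficients reduce the formula to min(d_ki, d_kj).
    return lance_williams(d_ki, d_kj, d_ij, 0.5, 0.5, 0.0, -0.5)


def error_sum_of_squares_update(d_ki, d_kj, d_ij, n_i, n_j, n_k):
    # Size-dependent coefficients, assuming squared Euclidean dissimilarities.
    n = n_i + n_j + n_k
    return lance_williams(d_ki, d_kj, d_ij,
                          (n_i + n_k) / n, (n_j + n_k) / n, -n_k / n, 0.0)


# Hypothetical singleton clusters on a line: i at 0, j at 2, k at 5.
d_ki, d_kj, d_ij = 25.0, 9.0, 4.0  # squared distances
print(single_linkage_update(d_ki, d_kj, d_ij))
print(error_sum_of_squares_update(d_ki, d_kj, d_ij, 1, 1, 1))
```

For the error-sum-of-squares case the result can be checked by hand: the merged cluster has centroid 1 and size 2, so the direct value 2 · (1 · 2 / 3) · (5 − 1)² = 64/3 matches the update.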

371 citations

Journal Article•DOI•

Clustering and organization in free recall.

Thomas J. Shuell

01 Nov 1969-Psychological Bulletin

TL;DR: In this article, the authors reviewed and evaluated research on clustering and subjective organization (SO) in free recall, and two intercorrelation matrices among clustering measures and the number of words recalled are presented.

Abstract: Research on clustering and subjective organization (SO) in free recall is reviewed and evaluated. Various indexes developed to measure clustering and SO are evaluated, and two intercorrelation matrices among clustering measures and the number of words recalled are presented. The existence of a large negative bias in the correlation between the ratio of repetition (RR) measure and recall is demonstrated. Various theoretical issues which have developed from the study of organization in free recall are presented and discussed.

193 citations

Book Chapter•DOI•

Toward a practical method which helps uncover the structure of a set of multivariate observations by finding the linear transformation which optimizes a new “index of condensation”

Joseph B. Kruskal¹•Institutions (1)

Bell Labs¹

01 Jan 1969

TL;DR: A major problem in data analysis is to find any structure in a set of multivariate observations; if each observation is represented as a point in multidimensional space, this means finding the structure of a configuration of points in high-dimensional space.

Abstract: Publisher Summary A major problem in data analysis is to find any structure in a set of multivariate observations. If each observation is represented as a point in multidimensional space, this means finding the structure of a configuration of points in high-dimensional space. To find linear relationships among the variables, linear regression, principal components, and factor analysis are often used. One very simple and important kind of structure is clustering. Whenever the points cluster together, the knowledge of this is almost sure to be useful to the man who is interested in the data. Some structure-seeking methods depend on the distances between the points. For example, many cluster-seeking techniques look for collections of points whose interpoint distances are small in some sense. Similarly, the method of parametric mapping starts by calculating the matrix of interpoint distances of the original configuration and subsequently works only with that.

143 citations

Journal Article•DOI•

On the methods and theory of clustering.

Joseph L. Fleiss, Joseph Zubin

01 Apr 1969-Multivariate Behavioral Research

TL;DR: The key defect in almost all clustering procedures seems to be the absence of a statistical model, and the suggestion is made that the clustering problem be stated as a mixture problem.

Abstract: The need for methods of clustering individuals into homogeneous groups seems clear. One hopes, by applying them to his data, to discover clusterings which may prove to be important. This aim appears straightforward, but the methods which exist do not necessarily satisfy it. The procedures which employ the correlation measure of profile similarity, and those which employ the distance measure are discussed. Technical and logical problems are shown to exist for both measures. The key defect in almost all clustering procedures seems to be the absence of a statistical model. The suggestion is made that the clustering problem be stated as a mixture problem. The need for further work by psychologists and statisticians is pointed out.

125 citations

Journal Article•DOI•

A Dynamic Programming Algorithm for Cluster Analysis

Robert E. Jensen¹•Institutions (1)

University of Maine¹

01 Dec 1969-Operations Research

TL;DR: A dynamic programming approach is presented that reduces the amount of redundant transitional calculations implicit in a total enumeration approach to partitioning N entities into M disjoint and nonempty subsets (clusters).

Abstract: This paper considers the problem of partitioning N entities into M disjoint and nonempty subsets (clusters). Except when both N and N-M are very small, a search for the optimal solution by total enumeration of all clustering alternatives is quite impractical. The paper presents a dynamic programming approach that reduces the amount of redundant transitional calculations implicit in a total enumeration approach. A comparison of the number of calculations required under each approach is presented in Appendix A. Unlike most clustering approaches used in practice, the dynamic programming algorithm will always converge on the best clustering solution. The efficiency of the dynamic programming approach depends upon the rapid-access computer memory available. A numerical example is given in Appendix B.
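
To see why total enumeration is impractical, it helps to count the alternatives. The sketch below is not the paper's algorithm; it only counts the ways to split N labelled entities into M nonempty, disjoint subsets, using the standard recurrence for Stirling numbers of the second kind. The function name `count_partitions` is hypothetical.

```python
def count_partitions(n, m):
    """Number of ways to split n labelled entities into m nonempty, disjoint subsets."""
    # table[i][j] = number of partitions of i entities into j nonempty subsets
    table = [[0] * (m + 1) for _ in range(n + 1)]
    table[0][0] = 1
    for i in range(1, n + 1):
        for j in range(1, min(i, m) + 1):
            # Entity i either joins one of the j existing subsets
            # or starts a new subset on its own.
            table[i][j] = j * table[i - 1][j] + table[i - 1][j - 1]
    return table[n][m]


print(count_partitions(4, 2))
print(count_partitions(10, 3))
print(count_partitions(25, 5))
```

The count grows very quickly with N, which is the situation the abstract describes: enumeration is feasible only when N, or N-M, is very small.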

122 citations

Book•

Clustering and Aggregation in Economics

Walter Dummer Fisher

01 Feb 1969

109 citations

Journal Article•DOI•

Pattern recognition with measurement space and spatial clustering for multiple images

Robert M. Haralick¹, G.L. Kelly•Institutions (1)

University of Kansas¹

01 Apr 1969

TL;DR: Remote sensor imaging technology makes it possible to obtain multiple images of extensive land areas simultaneously from the radar, infrared, and visible portions of the electromagnetic spectrum, and it would be useful to automatically obtain from such data land-use maps indicating those areas of similar types of land, that is, similar as seen through the sensor's eyes.

Abstract: Remote sensor imaging technology makes it possible to obtain multiple images of extensive land areas simultaneously from the radar, infrared, and visible portions of the electromagnetic spectrum. It would be useful to automatically obtain from such data land-use maps indicating those areas of similar types of land, that is, similar as seen through the sensor's eyes. This classification problem is approached from the perspective of the structure inherent in the data. The classification categories or clusters so constructed are the natural homogeneous groupings within the data. There is high similarity within each cluster and high dissimilarity between clusters. Two clustering procedures are presented: the first partitions the image sequence and the second partitions the measurement space. In both, the partition is constructed by finding appropriate center sets and then chaining to them all similar enough points. The resulting clusters are simply connected and not necessarily convex. An example of the measurement space clustering procedure is presented for a set of three multispectral images taken over Phoenix, Ariz.

79 citations

Journal Article•DOI•

A Note on Proximity Measures and Cluster Analysis

Paul E. Green¹, Vithala R. Rao¹•Institutions (1)

University of Pennsylvania¹

01 Aug 1969-Journal of Marketing Research

TL;DR: This article shows some of the interrelationships among various measures that have been suggested for summarizing pairwise proximities and demonstrates that clustering results are not invariant over these alternative measures.

Abstract: Clustering techniques and related approaches to numerical classification are beginning to receive a fair amount of attention by marketing researchers. Three articles on the subject [2, 9, 11] have already appeared in JMR, and a variety of marketing studies using clustering procedures have been reported in working papers. One of the principal problems in applying cluster analysis is the choice of what proximity measure to use in summarizing the similarity (or dissimilarity) of profile pairs. Morrison [10] discussed some problems associated with using a Euclidean distance measure in the space of original variables, a point also made by Overall [12] in the psychological literature. This article shows some of the interrelationships among various measures that have been suggested for summarizing pairwise proximities and demonstrates that clustering results are not invariant over these alternative measures. Despite the arguments for using one measure in preference to another, we believe that no "dominant" proximity measure currently exists, given such high variation in the researcher's objectives [5]. The ten proximity measures used in this comparative study follow:

64 citations

Journal Article•DOI•

Feature Extraction on Binary Patterns

George Nagy¹•Institutions (1)

IBM¹

01 Oct 1969-IEEE Transactions on Systems Science and Cybernetics

TL;DR: A modified version of the Isodata or K-means clustering algorithm is applied to a set of patterns originally proposed by Block, Nilsson, and Duda, and to another artificial alphabet.

Abstract: The objects and methods of automatic feature extraction on binary patterns are briefly reviewed. An intuitive interpretation for geometric features is suggested whereby such a feature is conceived of as a cluster of component vectors in pattern space. A modified version of the Isodata or K-means clustering algorithm is applied to a set of patterns originally proposed by Block, Nilsson, and Duda, and to another artificial alphabet. Results are given in terms of a figure-of-merit which measures the deviation between the original patterns and the patterns reconstructed from the automatically derived feature set.
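
For readers unfamiliar with the clustering step named here, the following is a sketch of plain K-means on binary patterns: assign each pattern to its nearest center, then replace each center by the mean of its members. It is not the modified version used in the paper, whose changes the abstract does not describe. The 4-bit patterns and the choice of starting centers are hypothetical.

```python
def kmeans(points, initial_centers, iterations=10):
    """Plain K-means: alternate nearest-center assignment and mean update."""
    centers = [list(c) for c in initial_centers]
    labels = []
    for _ in range(iterations):
        # Assignment step: index of the nearest center by squared distance.
        labels = [
            min(range(len(centers)),
                key=lambda k: sum((x - c) ** 2 for x, c in zip(p, centers[k])))
            for p in points
        ]
        # Update step: each center moves to the mean of its members.
        for k in range(len(centers)):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:  # an empty cluster keeps its previous center
                centers[k] = [sum(column) / len(members) for column in zip(*members)]
    return labels, centers


# Hypothetical 4-bit patterns: two "left-heavy" and two "right-heavy".
patterns = [(1, 1, 0, 0), (1, 1, 1, 0), (0, 0, 1, 1), (0, 1, 1, 1)]
labels, centers = kmeans(patterns, [patterns[0], patterns[2]])
print(labels)
print(centers)
```

Each resulting center can be read as a candidate feature: a component near 1 marks a bit that is usually set in that cluster of patterns.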

Journal Article•DOI•

Characterization of penetrant clustering in polymers

T. A. Orofino¹, H. B. Hopfenberg², V. Stannett²•Institutions (2)

Durham University¹, North Carolina State University²

01 Dec 1969-Journal of Macromolecular Science, Part B

TL;DR: In this paper, the formal description of penetrant clustering in polymers given by Zimm and Lundberg is applied to systems obeying Flory-Huggins thermodynamics.

Abstract: The formal description of penetrant clustering in polymers given by Zimm and Lundberg is applied to systems obeying Flory-Huggins thermodynamics. Several equivalent procedures for evaluation of the cluster function are suggested including circumstances in which the interaction parameter χ1 varies with composition. An analysis of the cluster function for penetrant and the companion expression for clustering of the polymeric solute is presented for some familiar, special cases of Flory-Huggins' behavior. The utility of the analyses is illustrated by application of these analytical techniques to data taken from the literature.

Journal Article•

Lack of Time-Space Clustering of Childhood Leukemia in Los Angeles County, 1960–1964

Andrew G. Glass, Nathan Mantel

01 Nov 1969-Cancer Research

TL;DR: The analyses revealed no significant or suggestive indication of time-space clustering, whether applied to deaths at age 0–14, 0–5, or 2–9 years, and the absence of clustering is interpreted as suggesting that, if leukemia is due to an infectious agent, then it is one to which humans are highly resistant.

Abstract: Dates and residence data for 298 Los Angeles childhood leukemia deaths during the period 1960–64 were analyzed for time-space clustering by 2 approaches. By the Knox approach, 2 cases are considered to be close neighbors if they are within both a specified spatial and a specified temporal distance of each other. A wide spectrum of critical distances, both for time and space, was used in identifying close neighbors. By the Mantel approach, a correlation- or regression-type analysis is made to see if the reciprocal of the spatial separation between any 2 cases is related to the reciprocal of the absolute temporal separation. The latter approach requires increasing each separation by a suitable additive constant prior to taking reciprocals, and for this purpose a wide spectrum of additive constants was employed relative to each type of separation. Significance tests for both approaches were made by the permutational procedure described by Mantel. This procedure permits one to obtain the expectation and variance of the statistics yielded by either the Knox or Mantel approach under all possible random pairings of the reported dates of death and places of residence. The analyses revealed no significant or suggestive indication of time-space clustering, whether applied to deaths at age 0–14, 0–5, or 2–9 years. The absence of clustering is interpreted as suggesting that, if leukemia is due to an infectious agent, then it is one to which humans are highly resistant. Some interesting insights were obtained from the behavior of the statistical procedures employed.
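
The Knox approach described above can be sketched in a few lines: count the case pairs that fall within both a critical spatial distance and a critical temporal distance. The second function estimates what that count would average under random pairings of dates and places by brute-force enumeration of every permutation, which is only workable for a handful of cases; it stands in for, and is not, the permutational procedure cited in the abstract, which obtains the expectation and variance without enumeration. All coordinates, dates, units, and thresholds below are hypothetical.

```python
import math
from itertools import combinations, permutations


def knox_statistic(cases, max_km, max_days):
    """Count pairs of cases that are close in both space and time.

    Each case is (x_km, y_km, day); the units are an assumption of this sketch.
    """
    close = 0
    for (x1, y1, t1), (x2, y2, t2) in combinations(cases, 2):
        near_in_space = math.hypot(x1 - x2, y1 - y2) <= max_km
        near_in_time = abs(t1 - t2) <= max_days
        if near_in_space and near_in_time:
            close += 1
    return close


def knox_null_mean(cases, max_km, max_days):
    """Average Knox count over every re-pairing of dates with places."""
    places = [(x, y) for x, y, _ in cases]
    times = [t for _, _, t in cases]
    counts = [
        knox_statistic([(x, y, t) for (x, y), t in zip(places, shuffled)],
                       max_km, max_days)
        for shuffled in permutations(times)
    ]
    return sum(counts) / len(counts)


# Hypothetical cases: three neighbours and one distant case.
cases = [(0.0, 0.0, 0), (1.0, 0.0, 10), (0.5, 0.5, 400), (50.0, 50.0, 5)]
observed = knox_statistic(cases, max_km=2.0, max_days=30)
expected = knox_null_mean(cases, max_km=2.0, max_days=30)
print(observed, expected)
```

An observed count well above the permutation average would suggest time-space clustering; in this toy data the observed count is below it.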

Journal Article•DOI•

Rhyme as a determinant of clustering

W. A. Bousfield¹, David A. Wicklund¹•Institutions (1)

University of Connecticut¹

01 Apr 1969-Psychonomic science

TL;DR: The authors dealt with rhyme as a determinant of clustering in free recall and showed significant clustering as well as high variance attributable to both Ss and word pairs.

Abstract: This study dealt with rhyme as a determinant of clustering in free recall. The stimuli comprised 12 rhyming pairs of words. The results from 30 Ss showed significant clustering as well as high variance attributable to both Ss and word pairs.

Journal Article•DOI•

Clustering Effects in Cu-Ni

Stephen Moss¹•Institutions (1)

Massachusetts Institute of Technology¹

18 Aug 1969-Physical Review Letters

TL;DR: While concentrated Cu-Ni alloys show local correlations characteristic of clustering, it is re-emphasized, in light of the recent claim of Kidron, that normal solution heat treatment of them (which includes nearly any cooling procedure) yields a small deviation from randomness of the atomic arrangements and no Guinier-zone formation or phase separation.

Abstract: While concentrated Cu-Ni alloys show local correlations characteristic of clustering, it is re-emphasized, in light of the recent claim of Kidron, that normal solution heat treatment of them (which includes nearly any cooling procedure) yields a small deviation from randomness of the atomic arrangements and no Guinier-zone formation or phase separation.

Journal Article•

Transfer of learning in associative clustering of retardates and normals.

Irma R. Gerjuoy, Jose M. Alvarez

01 Mar 1969-American journal of mental deficiency

Journal Article•DOI•

Reduction of clustering problem to pattern recognition

Tsuguchika Kaminuma¹, Tadao Takekawa¹, Satosi Watanabe¹•Institutions (1)

University of Hawaii¹

01 Mar 1969-Pattern Recognition

TL;DR: It is shown to be effective to select a small number of “representative” objects first, to apply the clustering program to them, and then to place the remaining objects in the generated classes by the pattern recognition technique.

Journal Article•DOI•

Acute leukaemia in New England. An investigation into the clustering of cases in time and place.

M Merrington, C C Spicer

01 May 1969-Journal of Epidemiology and Community Health

TL;DR: A statistical analysis is reported of some 540 cases of acute leukaemia occurring in Maine, Massachusetts, New Hampshire, and Vermont, investigating whether there is evidence for clustering of cases in place and time.

Abstract: The problem of discovering evidence for clustering of cases in place and time has only recently been investigated by epidemiologists and statisticians. Acute leukaemia has sometimes been described as a disease which does occur in clusters (Heath and Hasterlik, 1963; Kellett, 1937; Knox, 1964; Mainwaring, 1966; Meighan and Knox, 1965). Conversely, it has also been reported that the disease does not occur in clusters (Ager, Schuman, Wallace, Rosenfield, and Gullen, 1965; Barton, David, and Merrington, 1965; Clemmesen, Busk and Nielsen, 1952; Ederer, Myers and Mantel, 1964; Ederer, Myers, Eisenberg, and Campbell, 1965; Lock and Merrington, 1967; Lundin, Fraumeni, Lloyd, and Smith, 1966; Stark and Mantel 1967). We report here the statistical analysis on some 540 cases of acute leukaemia occurring in Maine, Massachusetts, New Hampshire, and Vermont. The records of acute leukaemia in Connecticut were not available to us but they have been described elsewhere (Ederer et al., 1965).

Journal Article•DOI•

On Alpha-Clustering in Nuclear Matter

Yoshinori Akaishi¹, Hiroharu Bandō¹•Institutions (1)

Kyoto University¹

01 Jun 1969-Progress of Theoretical Physics

Journal Article•DOI•

Sequential Algorithm for the Design of Piecewise Linear Classifiers

Roy Louis Hoffman¹, Maynard L. Moe²•Institutions (2)

IBM¹, University of Denver²

01 Apr 1969-IEEE Transactions on Systems Science and Cybernetics

TL;DR: A sequential algorithm for designing piecewise linear classification functions without a priori knowledge of pattern class distributions is described that combines adaptive error correcting linear classifier design procedures and clustering techniques under control of a performance criterion.

Abstract: A sequential algorithm for designing piecewise linear classification functions without a priori knowledge of pattern class distributions is described. The algorithm combines adaptive error correcting linear classifier design procedures and clustering techniques under control of a performance criterion. The classification function structure is constrained to minimize design calculations and increase recognition through-put for many classification problems. Examples from the literature are used to evaluate this approach relative to other classification algorithms.

Journal Article•

Improving Scalability, Sparsity and Cold Start user Issues in Collaborative Tagging with Incremental Clustering and Trust

Latha Banda and Karan Singh

31 Dec 1969-Recent Patents on Computer Science

TL;DR: A method, Collaborative Tagging (CT) with Incremental Clustering and Trust, is proposed which enhances recommendation quality: scalability issues are removed with the help of Incremental Clustering, and sparsity and cold-start user or item problems are resolved with the help of Trust.

Abstract: Because web sites hold huge amounts of data, recommending every product to users is impossible. Recommender Systems (RS) were introduced to address this problem. RS are categorized into Content-Based (CB), Collaborative Filtering (CF), and Hybrid RS, and recommendations are made to users based on these techniques. Of these, CF is the most recent technique used in RS, and it also provides a tagging feature. Three main issues occur in RS: the scalability problem, which occurs when there is a huge amount of data; the sparsity problem, which occurs when rating data is missing; and the cold-start user or item problem, which occurs when a new user or new item enters the system. To avoid these issues, we propose a method, Collaborative Tagging (CT) with Incremental Clustering and Trust, which enhances recommendation quality: scalability issues are removed with the help of Incremental Clustering, and the sparsity and cold-start user or item problems are resolved with the help of Trust. We compare the results of Collaborative Tagging with Incremental Clustering and Trust (CFT-EDIC-TS) with the baseline approaches of CT with Cosine similarity (CFT-CS), CT with Euclidean Distance and Incremental Clustering (CFT-EDIC), and CT with Trust (CFT-TS). The metrics used are MAE, prediction percentage, Precision, and Recall. Based on these metrics, CFT-EDIC-TS showed the best results for every split compared with the baseline approaches.

Journal Article•DOI•

Clustering in free recall based upon input contiguity

William P. Wallace¹•Institutions (1)

University of Nevada, Reno¹

01 Jun 1969-Psychonomic science

TL;DR: In this paper, a list of 16 words was exposed for one study trial in a modified free-recall experiment, where critical words in the list were paired randomly and presented three times each.

Abstract: A list of 16 words was exposed for one study trial in a modified free-recall experiment. Critical words in the list were paired randomly and presented three times each. For one group (E) the two members of common pairs always appeared in successive positions during the study trial. For the other group (C) members of the predetermined random pairs were never presented successively. Clustering scores in recall based on the predetermined random pairings were significantly higher in Group E than in Group C. It was concluded that adjacency relations during the study trial provided a sufficient basis for clustering during recall.

Journal Article•

Clustering Method for categorical and Numeric Data sets

Simmi Bagga, G. N. Singh

31 Dec 1969-Global journal of computer science and technology

TL;DR: The stub-based clustering approach reduces computation time compared with traditional clustering and also increases its efficiency.

Abstract: Many issues in the clustering process arise from the large datasets involved. Clustering computation becomes expensive when large data sets are involved and works efficiently only when there is a limited number of clusters and a relatively small data set. This paper presents a new clustering technique for large datasets that works equally efficiently with large and small data sets. The main idea behind this method is to divide the whole process into two steps. The first step uses a cheap approximate distance measure to divide the data into overlapping subsets, which we call stubs. In the second step, clustering is performed by measuring exact distances only between points that occur in a common stub. The stub-based clustering approach reduces computation time compared with traditional clustering and also increases its efficiency.
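
The two-step idea can be sketched as follows. The abstract gives no construction details, so the loose and tight thresholds, the order in which stub centers are picked, and both function names are assumptions made for illustration. The first function builds possibly overlapping stubs with a cheap distance, and the second lists the only pairs for which an exact distance would then be computed.

```python
from itertools import combinations


def build_stubs(points, cheap_distance, loose, tight):
    """Group point indices into possibly overlapping stubs using a cheap distance.

    Assumed scheme: take the first remaining point as a center, put every
    point within `loose` of it into a stub, and drop points within `tight`
    from the list of future centers. Requires 0 <= tight <= loose.
    """
    remaining = list(range(len(points)))
    stubs = []
    while remaining:
        center = remaining[0]
        stubs.append([i for i in range(len(points))
                      if cheap_distance(points[center], points[i]) <= loose])
        # The center itself is at distance 0, so the list shrinks every pass.
        remaining = [i for i in remaining
                     if cheap_distance(points[center], points[i]) > tight]
    return stubs


def pairs_needing_exact_distance(stubs):
    """Index pairs that share at least one stub."""
    pairs = set()
    for stub in stubs:
        pairs.update(combinations(sorted(stub), 2))
    return pairs


# Hypothetical one-dimensional data with an absolute-difference cheap distance.
points = [0.0, 1.0, 10.0, 11.0]
stubs = build_stubs(points, lambda a, b: abs(a - b), loose=3.0, tight=1.5)
pairs = pairs_needing_exact_distance(stubs)
print(stubs, pairs)
```

In this toy example only two of the six possible pairs need an exact distance, which is where the claimed saving in computation time would come from.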

Journal Article•DOI•

Clustering of related but nonassociated items in free recall

Joseph F. Fagan¹•Institutions (1)

Case Western Reserve University¹

01 Feb 1969-Psychonomic science

TL;DR: The authors observed reliable category clustering in free recall of words which rhymed but which did not elicit one another as free associates, and found that intrusions were phonemically similar to list items.

Abstract: Reliable category clustering was observed in the free recall of words which rhymed but which did not elicit one another as free associates. Intrusions were phonemically similar to list items.

Vapor phase clustering model for water

Richard Wayne Bolander

01 Jan 1969

Semantic Differential Relationships as a Determinant of Clustering

Burr R. Beckwith

01 Jan 1969

Journal Article•DOI•

Mathematical Modeling and Fractal Clustering Algorithm in Online Shopping

Shanhui Sun, Hong Li, Zhuangzhuang Li, Bingqiu Zhang

31 Dec 1969-Journal of Networks

TL;DR: The paper establishes the mathematical model of fractal clustering, and uses the fractal dimension to describe and depict Fractal Company, and conducts customer segmentation by using fractal theory and clustering analysis technology, in order to dig out the most valuable customers and potential customers.

Abstract: This paper puts forward the concept of the fractal company in the process of online shopping. According to the characteristics of network shopping, the paper establishes the mathematical model of fractal clustering and uses the fractal dimension to describe and depict the fractal company. Online shopping is actually the management of the entire supply chain. Based on the similar structure of the fractal supply chain, the article builds a mathematical model of the fractal supply chain by using dissipative structure theory and entropy theory. Finally, the paper conducts customer segmentation by using fractal theory and clustering analysis technology, in order to identify the most valuable customers and potential customers.

Journal Article•DOI•

Category clustering in incidental learning.

Fred Shima

01 Jan 1969-Journal of Experimental Psychology

An axiomatic basis and computational methods for optimal clustering.

Mario Padron

01 Mar 1969

TL;DR: This report is concerned with the problem of classifying objects into clusters in such a way that objects within the same cluster are alike and objects in different clusters are relatively dissimilar.

Abstract: This report is concerned with the problem of classifying objects into clusters in such a way that objects within the same cluster are alike and objects in different clusters are relatively dissimilar. A distance or measure of similarity is required in order to measure the degree of likeness or similarity existing between any pair of objects. A clustering criterion or measure of the goodness of any given allocation is developed from basic postulates which attempt to quantify the notions of within group homogeneity and between group heterogeneity. Basic mathematical and experimental properties of the clustering criterion are demonstrated and illustrated. The problem is then imbedded into a mathematical programming formulation which permits the theoretical development of a computational algorithm which converges to an optimal solution for problems of a limited size. With a significant contribution from the algorithm a heuristic method is developed to facilitate the use of the technique for larger problems with a great increase in speed and a very small reduction in accuracy. Several examples are presented to illustrate the properties of the two computational methods developed. Finally, the work presented is compared with other major contributions in this field and suggestions for further research are given. The appendices include three of the major computer programs developed, together with an outline of the problem of grouping objects to minimize an interaction cost, which could be considered a special case of the clustering problem. (Author)

Journal Article•DOI•

Voice Classification II: Extension of Sample and Language Independence

Herman R. Silbiger

01 Jan 1969-Journal of the Acoustical Society of America

TL;DR: In this paper, a multidimensional scaling analysis, parametric mapping, was applied to the semantic differential data, and the results showed that the ratings on the semantic differential represent the perception of the voice quality and are not de...

Abstract: Additional data have been obtained on the perceptual classification of voices by means of the voice classification by hierarchical clustering method [J. Acoust. Soc. Amer. 40, 1282 (A) (1966)]. The voice sample has been extended to 100 male and 100 female voices. The method appears to yield stable results since the gross structure of the clustering remains invariant. The semantic differential data were also submitted to a multidimensional scaling analysis, parametric mapping, which provided an excellent three‐dimensional fit. To investigate any dependence of the classification method on the speaker's language, 20 males whose native language was other than English were recorded in their native language and in English. The clustering analysis showed closely comparable results for the voices speaking either English or the foreign language, with 13 languages represented. This provides additional evidence that the ratings on the semantic differential represent the perception of the voice quality and are not de...
