Journal Article
A randomized algorithm for two-cluster partition of a set of vectors
TL;DR: A randomized algorithm is substantiated for the strongly NP-hard problem of partitioning a finite set of vectors of Euclidean space into two clusters of given sizes so as to minimize the sum of squared distances.
Abstract:
A randomized algorithm is substantiated for the strongly NP-hard problem of partitioning a finite set of vectors of Euclidean space into two clusters of given sizes according to the minimum sum-of-squared-distances criterion. It is assumed that the centroid of one of the clusters is to be optimized and is determined as the mean value over all vectors in this cluster. The centroid of the other cluster is fixed at the origin. For an established parameter value, the algorithm finds an approximate solution of the problem in time that is linear in the space dimension and the input size of the problem for given values of the relative error and failure probability. The conditions are established under which the algorithm is asymptotically exact and runs in time that is linear in the space dimension and quadratic in the input size of the problem.
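To make the criterion in the abstract concrete, the following is a minimal Python sketch, not the authors' algorithm. It evaluates the objective (squared distances from one cluster to its own mean, plus squared norms of the remaining vectors, whose center is fixed at the origin) and finds the optimum by exhaustive search on a tiny input. The function `sampled_heuristic` is a hypothetical illustration of a sampling-based approach: it uses the mean of a random sample as a trial center and keeps the M vectors with the largest inner products with it. The names `objective`, `brute_force`, `sampled_heuristic`, and the parameters `k` and `trials` are assumptions introduced here for illustration.

```python
from itertools import combinations

import numpy as np


def objective(Y, idx):
    """Squared distances of cluster `idx` to its mean, plus squared norms of the rest."""
    mask = np.zeros(len(Y), dtype=bool)
    mask[list(idx)] = True
    cluster, rest = Y[mask], Y[~mask]
    centroid = cluster.mean(axis=0)  # optimized center: the cluster mean
    # The second cluster's center is fixed at the origin.
    return float(((cluster - centroid) ** 2).sum() + (rest ** 2).sum())


def brute_force(Y, M):
    """Exhaustive search over all size-M subsets; feasible only for tiny inputs."""
    return min(combinations(range(len(Y)), M), key=lambda idx: objective(Y, idx))


def sampled_heuristic(Y, M, k, trials, rng):
    """Hypothetical sampling heuristic (an assumption, not the paper's procedure).

    For a fixed trial center x, the best size-M cluster consists of the M
    vectors with the largest inner products <y, x>; here x is the mean of
    k vectors sampled with replacement.
    """
    best_idx, best_val = None, np.inf
    for _ in range(trials):
        sample = rng.choice(len(Y), size=k, replace=True)
        x = Y[sample].mean(axis=0)
        idx = tuple(sorted(np.argsort(Y @ x)[-M:].tolist()))
        val = objective(Y, idx)
        if val < best_val:
            best_idx, best_val = idx, val
    return best_idx, best_val


# Tiny example: three vectors near the origin, three far from it.
Y = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
              [10.0, 10.0], [10.1, 10.0], [10.0, 10.1]])
exact = brute_force(Y, M=3)
approx, approx_val = sampled_heuristic(Y, M=3, k=2, trials=20,
                                       rng=np.random.default_rng(0))
```

The exhaustive search grows combinatorially with the input size, which is why approximate and randomized methods are of interest for this problem; the heuristic's objective value can never be smaller than the exhaustive optimum.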
Citations
Minimum Sum of Squares Clustering in a Low Dimensional Space
TL;DR: An exact polynomial algorithm, with a complexity in O(N^(p+1) log N), is proposed for minimum sum of squares hierarchical divisive clustering of points in a p-dimensional space with small p.
Journal Article
Fully polynomial-time approximation scheme for a special case of a quadratic Euclidean 2-clustering problem
TL;DR: The strongly NP-hard problem of partitioning a finite set of points of Euclidean space into two clusters of given sizes (cardinalities) minimizing the sum (over both clusters) of the intracluster sums of squared distances from the elements of the clusters to their centers is considered in this paper.
Journal Article
Polynomial-time approximation scheme for a problem of partitioning a finite set into two clusters
TL;DR: This work considers the strongly NP-hard problem of partitioning a finite set of points of Euclidean space into two clusters of given cardinalities under the minimum criterion for the sum over the clusters of the intracluster sums of squared distances from elements of the cluster to its center.
Journal Article
An exact pseudopolynomial algorithm for a problem of the two-cluster partitioning of a set of vectors
TL;DR: It is proved that, for a fixed dimension of the space, the problem of partitioning a set of Euclidean vectors into two clusters of given sizes is solvable in polynomial time.
Book Chapter
A Fully Polynomial-Time Approximation Scheme for a Special Case of a Balanced 2-Clustering Problem
Alexander Kel'manov, Anna Motkova +1 more
TL;DR: An approximation algorithm is presented for the strongly NP-hard problem of partitioning a set of Euclidean points into two clusters and it is proved that it is a fully polynomial-time approximation scheme when the space dimension is bounded by a constant.
References
Some methods for classification and analysis of multivariate observations
TL;DR: The k-means procedure partitions an N-dimensional population into k sets on the basis of a sample. It generalizes the ordinary sample mean and is shown to give partitions that are reasonably efficient in the sense of within-class variance.
Journal Article
Data clustering: 50 years beyond K-means
TL;DR: A brief overview of clustering is provided, well known clustering methods are summarized, the major challenges and key issues in designing clustering algorithms are discussed, and some of the emerging and useful research directions are pointed out.
Book
Randomized Algorithms
TL;DR: This book introduces the basic concepts in the design and analysis of randomized algorithms and presents basic tools such as probability theory and probabilistic analysis that are frequently used in algorithmic applications.
Randomized Algorithms
TL;DR: For many applications, a randomized algorithm is either the simplest or the fastest algorithm available, and sometimes both. This book introduces the basic concepts in the design and analysis of randomized algorithms and provides a comprehensive and representative selection of algorithms across application areas.
NP-Hardness of Euclidean Sum-of-Squares Clustering
TL;DR: An alternate short proof of the NP-hardness of Euclidean sum-of-squares clustering is provided, replacing an earlier published proof that is not valid.