Journal Article

A randomized algorithm for two-cluster partition of a set of vectors

TLDR
This article substantiates a randomized algorithm for the strongly NP-hard problem of partitioning a finite set of vectors in Euclidean space into two clusters of given sizes so as to minimize the sum of squared distances.
Abstract
A randomized algorithm is substantiated for the strongly NP-hard problem of partitioning a finite set of vectors of Euclidean space into two clusters of given sizes according to the minimum sum-of-squared-distances criterion. It is assumed that the centroid of one of the clusters is to be optimized and is determined as the mean value over all vectors in this cluster. The centroid of the other cluster is fixed at the origin. For an established parameter value, the algorithm finds an approximate solution of the problem in time that is linear in the space dimension and the input size of the problem for given values of the relative error and failure probability. Conditions are established under which the algorithm is asymptotically exact and runs in time that is linear in the space dimension and quadratic in the input size of the problem.
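The objective described in the abstract can be made concrete with a small sketch. The Python below is illustrative only and is not the algorithm from the article: `cost` evaluates the criterion (cluster members measured against their mean, all other vectors against the origin), `brute_force` enumerates every subset of the given size, and `sampled_heuristic` is a hypothetical sampling heuristic, assumed here for illustration, that projects the input onto randomly drawn input vectors. All function names and the heuristic itself are assumptions, not taken from the source.

```python
import itertools
import random


def cost(vectors, cluster_idx):
    """Sum of squared distances: cluster members to their mean, the rest to the origin."""
    chosen = set(cluster_idx)
    dim = len(vectors[0])
    members = [vectors[i] for i in chosen]
    centroid = [sum(v[d] for v in members) / len(members) for d in range(dim)]
    total = 0.0
    for i, v in enumerate(vectors):
        center = centroid if i in chosen else [0.0] * dim
        total += sum((a - b) ** 2 for a, b in zip(v, center))
    return total


def brute_force(vectors, m):
    """Exact optimum by enumerating all size-m subsets (feasible for tiny inputs only)."""
    return min(itertools.combinations(range(len(vectors)), m),
               key=lambda idx: cost(vectors, idx))


def sampled_heuristic(vectors, m, trials, seed=0):
    """Hypothetical heuristic (an assumption, not the article's method)."""
    rng = random.Random(seed)
    best = None
    for _ in range(trials):
        direction = rng.choice(vectors)  # a randomly drawn input vector
        # Keep the m vectors with the largest inner product with that direction.
        order = sorted(
            range(len(vectors)),
            key=lambda i: -sum(a * b for a, b in zip(vectors[i], direction)),
        )
        candidate = tuple(sorted(order[:m]))
        if best is None or cost(vectors, candidate) < cost(vectors, best):
            best = candidate
    return best
```

Expanding the squares shows that the criterion equals the total squared norm of all vectors minus the squared norm of the cluster's vector sum divided by the cluster size, so minimizing it amounts to choosing the subset of the given size whose vector sum is longest. That is why ranking vectors by projection onto a candidate direction is a natural way to generate candidate clusters in a sketch like this one.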


Citations

Minimum Sum of Squares Clustering in a Low Dimensional Space

TL;DR: An exact polynomial algorithm, with a complexity in O(N^(p+1) log N), is proposed for minimum sum of squares hierarchical divisive clustering of points in a p-dimensional space with small p.
Journal Article

Fully polynomial-time approximation scheme for a special case of a quadratic Euclidean 2-clustering problem

TL;DR: The strongly NP-hard problem of partitioning a finite set of points of Euclidean space into two clusters of given sizes (cardinalities) minimizing the sum (over both clusters) of the intracluster sums of squared distances from the elements of the clusters to their centers is considered in this paper.
Journal Article

Polynomial-time approximation scheme for a problem of partitioning a finite set into two clusters

TL;DR: This work considers the strongly NP-hard problem of partitioning a finite set of points of Euclidean space into two clusters of given cardinalities under the minimum criterion for the sum over the clusters of the intracluster sums of squared distances from elements of the cluster to its center.
Journal Article

An exact pseudopolynomial algorithm for a problem of the two-cluster partitioning of a set of vectors

TL;DR: It is proved that, for a fixed dimension of the space, the problem of partitioning a set of Euclidean vectors into two clusters of given sizes is solvable in polynomial time.
Book Chapter

A Fully Polynomial-Time Approximation Scheme for a Special Case of a Balanced 2-Clustering Problem

TL;DR: An approximation algorithm is presented for the strongly NP-hard problem of partitioning a set of Euclidean points into two clusters and it is proved that it is a fully polynomial-time approximation scheme when the space dimension is bounded by a constant.
References

Some methods for classification and analysis of multivariate observations

TL;DR: The k-means process partitions an N-dimensional population into k sets on the basis of a sample. It generalizes the ordinary sample mean and is shown to give partitions that are reasonably efficient in the sense of within-class variance.
Journal Article

Data clustering: 50 years beyond K-means

TL;DR: A brief overview of clustering is provided, well known clustering methods are summarized, the major challenges and key issues in designing clustering algorithms are discussed, and some of the emerging and useful research directions are pointed out.
Book

Randomized Algorithms

TL;DR: This book introduces the basic concepts in the design and analysis of randomized algorithms and presents basic tools such as probability theory and probabilistic analysis that are frequently used in algorithmic applications.

Randomized Algorithms

TL;DR: For many applications, a randomized algorithm is either the simplest or the fastest algorithm available, and sometimes both. This work introduces the basic concepts in the design and analysis of randomized algorithms and provides a comprehensive and representative selection of the algorithms that might be used in each application area.

NP-Hardness of Euclidean Sum-of-Squares Clustering

TL;DR: An alternate short proof of NP-hardness of Euclidean sum-of-squares clustering is provided, replacing a previously published proof that is not valid.