Open Access Proceedings Article

Better Guarantees for k-Means and Euclidean k-Median by Primal-Dual Algorithms

TL;DR: A new primal-dual approach is presented that exploits the geometric structure of k-means and satisfies the hard constraint that at most k clusters are selected, without deteriorating the approximation guarantee.
Abstract
Clustering is a classic topic in optimization, with k-means being one of the most fundamental such problems. In the absence of any restrictions on the input, the best known algorithm for k-means with a provable guarantee is a simple local search heuristic yielding an approximation guarantee of 9+ε, a ratio that is known to be tight with respect to such methods. We overcome this barrier by presenting a new primal-dual approach that allows us to (1) exploit the geometric structure of k-means and (2) satisfy the hard constraint that at most k clusters are selected, without deteriorating the approximation guarantee. Our main result is a 6.357-approximation algorithm with respect to the standard LP relaxation. Our techniques are quite general, and we also show improved guarantees for the general version of k-means, where the underlying metric is not required to be Euclidean, and for k-median in Euclidean metrics.
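The objective that all of these approximation ratios bound is the k-means cost: the sum of squared Euclidean distances from each point to its nearest chosen center. A minimal sketch of that cost function (the function name and example data are illustrative, not from the paper):

```python
import numpy as np

def kmeans_cost(points, centers):
    """Sum of squared Euclidean distances from each point to its
    nearest center -- the quantity the approximation ratios bound."""
    # pairwise squared distances, shape (n_points, n_centers)
    d2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    # each point pays for its nearest center
    return d2.min(axis=1).sum()

points = np.array([[0.0, 0.0], [1.0, 0.0], [10.0, 0.0]])
centers = np.array([[0.5, 0.0], [10.0, 0.0]])
print(kmeans_cost(points, centers))  # 0.25 + 0.25 + 0.0 = 0.5
```

An α-approximation algorithm (e.g. the 6.357-approximation above) returns at most k centers whose cost is at most α times the minimum possible cost over all choices of k centers.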


Citations
Book Chapter

Fair Coresets and Streaming Algorithms for Fair k-means

TL;DR: In this paper, the authors propose a streaming PTAS for fair k-means in the case of two colors (and exact balances), which, combined with their coreset construction, yields a constant-factor algorithm in the streaming model.
Proceedings Article

Constant approximation for k-median and k-means with outliers via iterative rounding

TL;DR: An iterative LP-rounding framework yields O(1)-approximation algorithms for k-median and k-means with outliers, the best known approximation guarantees for these problems.
Journal Article

Spectral rotation for deep one-step clustering

TL;DR: A deep spectral clustering method that embeds four components in a unified framework and develops a two-task deep clustering model with linear activation functions to directly output an effective clustering result.
Proceedings Article

Socially Fair k-Means Clustering

TL;DR: It is found that on benchmark datasets, Fair-Lloyd exhibits unbiased performance by ensuring that all groups have equal costs in the output k-clustering, while incurring a negligible increase in running time, thus making it a viable fair option wherever k-means is currently used.
Proceedings Article

(Individual) Fairness for k-Clustering

TL;DR: The k-median (k-means) cost of the solution is within a constant factor of the cost of an optimal fair k-clustering, and the solution approximately satisfies the fairness condition.
References
Journal Article

Least squares quantization in PCM

TL;DR: In this article, the author derives necessary conditions that the quanta and associated quantization intervals of an optimum finite quantization scheme must satisfy to minimize average quantization noise power.
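Lloyd's optimality conditions are the basis of the classic local search heuristic for k-means: alternately assign each point to its nearest center, then move each center to the mean of its assigned points. A minimal sketch of one such iteration (function name and example data are illustrative):

```python
import numpy as np

def lloyd_step(points, centers):
    """One Lloyd iteration: assign each point to its nearest center,
    then move each center to the mean of its assigned points."""
    d2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    labels = d2.argmin(axis=1)
    new_centers = centers.copy()
    for j in range(len(centers)):
        members = points[labels == j]
        if len(members) > 0:  # leave a center in place if it lost all points
            new_centers[j] = members.mean(axis=0)
    return new_centers, labels

points = np.array([[0.0], [1.0], [10.0], [11.0]])
centers = np.array([[0.0], [10.0]])
centers, labels = lloyd_step(points, centers)
print(centers.ravel())  # [ 0.5 10.5]
```

Each step can only decrease the cost, so the iteration converges, but only to a local optimum; this is why seeding (k-means++) and the approximation algorithms cited above matter.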
Proceedings Article

k-means++: the advantages of careful seeding

TL;DR: By augmenting k-means with a very simple, randomized seeding technique, this work obtains an algorithm that is Θ(log k)-competitive with the optimal clustering.
Book

Approximation Algorithms

TL;DR: Covering the basic techniques used in the latest research work, the author consolidates progress made so far, including some very recent and promising results, and conveys the beauty and excitement of work in the field.
Book Chapter

Data Clustering: 50 Years Beyond K-means

TL;DR: Cluster analysis is the formal study of algorithms and methods for grouping objects according to measured or perceived intrinsic characteristics, and is one of the most fundamental modes of understanding and learning.