Open Access Proceedings Article

Convergence Properties of the K-Means Algorithms

Léon Bottou, +1 more
- Vol. 7, pp 585-592
Abstract
This paper studies the convergence properties of the well-known K-Means clustering algorithm. The K-Means algorithm can be described either as a gradient descent algorithm or by slightly extending the mathematics of the EM algorithm to the hard-threshold case. We show that the K-Means algorithm actually minimizes the quantization error using the very fast Newton algorithm.
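
The Newton-step reading of K-Means stated in the abstract can be illustrated with a small sketch. It assumes the quantization error is written as E = 1/2 Σ_i min_k ||x_i − w_k||² and that cluster assignments are held fixed while the step is taken; under those assumptions the Hessian for centroid k is N_k times the identity, so a Newton step lands on the cluster mean. The function names below are hypothetical and the code is plain Python, not the authors' implementation.

```python
def sq_dist(p, c):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, c))


def assign(points, centroids):
    """Index of the nearest centroid for each point (hard assignment)."""
    return [min(range(len(centroids)), key=lambda k: sq_dist(p, centroids[k]))
            for p in points]


def quantization_error(points, centroids):
    """E = 1/2 * sum over points of squared distance to nearest centroid."""
    labels = assign(points, centroids)
    return sum(0.5 * sq_dist(p, centroids[k]) for p, k in zip(points, labels))


def newton_step(points, centroids):
    """One Newton step on E with assignments held fixed (assumption)."""
    labels = assign(points, centroids)
    dim = len(points[0])
    updated = []
    for k, w in enumerate(centroids):
        members = [p for p, lab in zip(points, labels) if lab == k]
        n_k = len(members)
        if n_k == 0:
            updated.append(list(w))  # empty cluster: leave centroid unchanged
            continue
        # Gradient of E with respect to w_k: sum over members of (w_k - x_i).
        grad = [sum(w[d] - p[d] for p in members) for d in range(dim)]
        # Hessian is n_k * identity, so the Newton step divides by n_k.
        updated.append([w[d] - grad[d] / n_k for d in range(dim)])
    return updated


def mean_step(points, centroids):
    """Classic batch K-Means update: move each centroid to its cluster mean."""
    labels = assign(points, centroids)
    dim = len(points[0])
    updated = []
    for k, w in enumerate(centroids):
        members = [p for p, lab in zip(points, labels) if lab == k]
        if not members:
            updated.append(list(w))
            continue
        updated.append([sum(p[d] for p in members) / len(members)
                        for d in range(dim)])
    return updated


points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
centroids = [(1.0, 1.0), (9.0, 1.0)]
after_newton = newton_step(points, centroids)
after_mean = mean_step(points, centroids)
```

On this toy data the two updates coincide, and the quantization error drops after the step, which is the behaviour the abstract describes.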


Citations
Journal Article

An efficient k-means clustering algorithm: analysis and implementation

TL;DR: This work presents a simple and efficient implementation of Lloyd's k-means clustering algorithm, which it calls the filtering algorithm, and establishes the practical efficiency of the algorithm's running time.
Book Chapter

A Survey of Clustering Data Mining Techniques

TL;DR: This survey concentrates on clustering algorithms from a data mining perspective as a data modeling technique that provides for concise summaries of the data.
Proceedings Article

Learning methods for generic object recognition with invariance to pose and lighting

TL;DR: A real-time version of the system was implemented that can detect and classify objects in natural scenes at around 10 frames per second. On a recognition task with highly cluttered images, SVM proved impractical, while convolutional nets yielded 16/7% error.
Proceedings Article

Web-scale k-means clustering

TL;DR: This work proposes the use of mini-batch optimization for k-means clustering, which reduces computation cost by orders of magnitude compared to the classic batch algorithm while yielding significantly better solutions than online stochastic gradient descent.
Journal Article

A comparative study of efficient initialization methods for the k-means clustering algorithm

TL;DR: It is demonstrated that popular initialization methods often perform poorly and that there are in fact strong alternatives to these methods, and eight commonly used linear time complexity initialization methods are compared.
References

Some methods for classification and analysis of multivariate observations

TL;DR: This paper describes the k-means process, a generalization of the ordinary sample mean, which partitions an N-dimensional population into k sets on the basis of a sample and is shown to give partitions that are reasonably efficient in the sense of within-class variance.
Book

Self Organization And Associative Memory

Teuvo Kohonen
TL;DR: The purpose and nature of biological memory, along with some related aspects of memory, are explained.
Proceedings Article

Note on Learning Rate Schedules for Stochastic Optimization

TL;DR: This note introduces "search-then-converge" type learning rate schedules, which outperform the classical constant and "running average" (1/t) schedules both in speed of convergence and quality of solution.