A Simple Linear Time (1 + ε)-Approximation Algorithm for k-Means Clustering
in Any Dimensions
Amit Kumar
Dept. of Computer Science
& Engg., IIT Delhi
New Delhi-110016, India
amitk@cse.iitd.ernet.in
Yogish Sabharwal
IBM India Research Lab
Block-I, IIT Delhi,
New Delhi-110016, India
ysabharwal@in.ibm.com
Sandeep Sen¹
Dept. of Computer Science
& Engg., IIT Delhi
New Delhi-110016, India
ssen@cse.iitd.ernet.in
Abstract
We present the first linear time (1+ε)-approximation algorithm for the k-means problem for fixed k and ε. Our algorithm runs in O(nd) time, which is linear in the size of the input. Another feature of our algorithm is its simplicity - the only technique involved is random sampling.
1. Introduction
The problem of clustering a group of data items into sim-
ilar groups is one of the most widely studied problems in
computer science. Clustering has applications in a variety of
areas, for example, data mining, information retrieval, im-
age processing, and web search ([5, 7, 14, 9]). Given the
wide range of applications, many different definitions of
clustering exist in the literature ([8, 4]). Most of these defi-
nitions begin by defining a notion of distance between two
data items and then try to form clusters so that data items
with small distance between them get clustered together.
Often, clustering problems arise in a geometric setting,
i.e., the data items are points in a high dimensional Eu-
clidean space. In such settings, it is natural to define the
distance between two points as the Euclidean distance be-
tween them. One of the most popular definitions of cluster-
ing is the k-means clustering problem. Given a set of points
P, the k-means clustering problem seeks to find a set K of k centers, such that
Σ_{p∈P} d(p, K)^2
is minimized. Note that the points in K can be arbitrary
points in the Euclidean space. Here d(p, K) refers to the dis-
tance between p and the closest center in K. We can think
¹ Author's present address: Dept. of Computer Science and Engineering, IIT Kharagpur 721302.
of this as each point in P gets assigned to the closest cen-
ter in K. The points that get assigned to the same center
form a cluster. The k-means problem is NP-hard even for
k = 2. Another popular definition of clustering is the k-
median problem. This is defined in the same manner as the
k-means problem except for the fact that the objective func-
tion is Σ_{p∈P} d(p, K). Observe that the distance measure
used in the definition of the k-means problem is not a met-
ric. This might lead one to believe that solving the k-means
problem is more difficult than the k-median problem. How-
ever, in this paper, we give strong evidence that this may not
be the case.
A lot of research has been devoted to solving the k-
means problem exactly (see [11] and the references therein).
Even the best known algorithms for this problem take at
least Ω(n^d) time. Recently, some work has been devoted
to finding (1 + ε)-approximation algorithms for the k-
means problem, where ε can be an arbitrarily small con-
stant. This has led to algorithms with much improved run-
ning time. Further, if we look at the applications of the k-
means problem, they often involve mapping subjective fea-
tures to points in the Euclidean space. Since there is an error
inherent in this mapping, finding a (1 + ε)-approximate so-
lution does not lead to a deterioration in the solution for the
actual application.
In this paper, we give the first truly linear time (1 + ε)-
approximation algorithm for the k-means problem. Treat-
ing k and ε as constants, our algorithm runs in O(nd) time,
which is linear in the size of the input. Another feature of
our algorithm is its simplicity - the only technique involved
is random sampling.
1.1. Related work
The fastest exact algorithm for the k-means clustering
problem was proposed by Inaba et al. [11]. They observed that the number of Voronoi partitions of k points in ℝ^d is O(n^{kd}) and so the optimal k-means clustering could be determined exactly in time O(n^{kd+1}). They also proposed a randomized (1 + ε)-approximation algorithm for the 2-means clustering problem with running time O(n/ε^d). Matousek [13] proposed a deterministic (1 + ε)-approximation algorithm for the k-means problem with running time O(n ε^{−2k^2 d} log^k n). Badoiu et al. [3] proposed a (1 + ε)-approximation algorithm for the k-median clustering problem with running time O(2^{(k/ε)^{O(1)}} d^{O(1)} n log^{O(k)} n). Their algorithm can be extended to get a (1 + ε)-approximation algorithm for the k-means clustering problem with a similar running time. de la Vega et al. [6] proposed a (1 + ε)-approximation algorithm for the k-means problem which works well for points in high dimensions. The running time of this algorithm is O(g(k, ε) n log^k n), where g(k, ε) = exp[(k^3/ε^8)(ln(k/ε)) ln k]. Recently, Har-Peled et al. [10] proposed a (1 + ε)-approximation algorithm for the k-means clustering whose running time is O(n + k^{k+2} ε^{−(2d+1)k} log^{k+1} n log^k(1/ε)). Their algorithm is also fairly complicated and relies on several results in computational geometry that depend exponentially on the number of dimensions. So this is more suitable for low dimensions only.
There exist other definitions of clustering, for example,
k-median clustering where the objective is to minimize the
sum of the distances to the nearest center and k-center clus-
tering, where the objective is to minimize the maximum dis-
tance (see [1, 2, 3, 10, 12] and references therein).
1.2. Our contributions
We present a linear time (1 + ε)-approximation algo-
rithm for the k-means problem. Treating k and ε as con-
stants, the running time of our algorithm is better in com-
parison to the previously known algorithms for this prob-
lem. However, the algorithm due to Har-Peled and Mazum-
dar [10] deserves careful comparison. Note that their algo-
rithm, though linear in n, is not linear in the input size of
the problem, which is dn (for n points in d dimensions).
Therefore, their algorithm is better only for low dimen-
sions; for d = Ω(log n), our algorithm is much faster. Even
use of the Johnson-Lindenstrauss lemma will not make the run-
ning time comparable as it has its own overheads. Many re-
cent algorithms rely on techniques like exponential grid or
scaling that have high overheads. For instance, normaliz-
ing with respect to the minimum distance between points may incur an extra Ω(n) cost per point depending on the com-
putational model. In [3], the authors have used rounding
techniques based on approximations of the optimal k-center
value without specifying the cost incurred in the process.
The techniques employed in our algorithm have no such
hidden overheads.
The 2-means clustering problem has also gener-
ated enough research interest in the past. Our algo-
rithm yields a (1 + ε)-approximation algorithm for the
2-means clustering problem with constant probabil-
ity in time O(2^{(1/ε)^{O(1)}} dn). This is the first dimension-independent (in the exponent) algorithm for this problem that runs in linear time.
The basic idea of our algorithm is very simple. We be-
gin with the observation of Inaba et al. [11] that given a set
of points, their centroid can be very well approximated by
sampling a constant number of points and finding the cen-
troid of this sample. So if we knew the clusters formed by
the optimal solution, we can get good approximations to the
actual centers. Of course, we do not know this fact. How-
ever, if we sample O (k) points, we know that we will get
a constant number of points from the largest cluster. Thus,
by trying all subsets of constant size from this sample, we
can essentially sample points from the largest cluster. In this
way, we can estimate the centers of large clusters. How-
ever, in order to sample from the smaller clusters, we need
to prune points from the larger clusters. This pruning has to balance two facts: we would not like to remove points from the smaller clusters, and yet we want to remove enough points from the larger clusters.
Our algorithm appears very similar in spirit to that of
Badoiu et al. [3]. In fact both these algorithms begin with the same premise of random sampling. However, in order to sample from the smaller clusters, their algorithm has to guess the sizes of the smaller clusters and the distances between clusters. This causes an O(log^k n) multiplicative fac-
tor in the running time of their algorithm. We completely
avoid this extra factor by a much more careful pruning al-
gorithm. Moreover this makes our algorithm considerably
simpler.
2. Preliminaries
Let P be a set of n points in the Euclidean space ℝ^d. Given a set of k points K, which we also denote as centers, define the k-means cost of P with respect to K, ∆(P, K), as
∆(P, K) = Σ_{p∈P} d(p, K)^2,
where d(p, K) denotes the distance between p and the closest point to p in K. The k-means problem seeks to find a set¹ K of size k such that ∆(P, K) is minimized. Let ∆_k(P) denote the cost of the optimal solution to the k-means problem with respect to P.
¹ In this paper we have addressed the unconstrained problem, where this set can consist of any k points in ℝ^d.
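To make the objective concrete, the following is a minimal sketch (ours, not the paper's; the function name and the (n, d)/(k, d) array layout are assumptions) of the cost ∆(P, K) that the algorithm tries to minimize:

```python
import numpy as np

def kmeans_cost(P: np.ndarray, K: np.ndarray) -> float:
    """Delta(P, K): sum over p in P of the squared distance from p to its nearest center in K.

    P is an (n, d) array of points, K is a (k, d) array of centers.
    """
    # Pairwise squared distances between every point and every center, shape (n, k).
    sq_dists = ((P[:, None, :] - K[None, :, :]) ** 2).sum(axis=2)
    # Each point contributes the squared distance to its closest center.
    return float(sq_dists.min(axis=1).sum())
```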

If K happens to be a singleton set {y}, then we shall de-
note ∆(P, K) by ∆(P, y). Similar comments apply when
P is a singleton set.
Definition 2.1. We say that the point set P is (k, ε)-irreducible if ∆_{k−1}(P) ≥ (1 + 32ε)∆_k(P). Otherwise we say that the point set is (k, ε)-reducible.
Reducibility basically captures the fact that if instead of
finding the optimal k-means solution, we find the optimal
(k 1)-means solution, we will still be close to the former
solution. We now look at some properties of the 1-means
problem.
2.1. Properties of the 1-means problem
Definition 2.2. For a set of points P , define the centroid,
c(P ), of P as the point
(Σ_{p∈P} p) / |P|.
For any point x ∈ ℝ^d, it is easy to check that
∆(P, x) = ∆(P, c(P)) + |P| · ∆(c(P), x).    (1)
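For completeness, here is the short calculation behind (1), which the text leaves to the reader (standard algebra, added as a reading aid):

```latex
\begin{aligned}
\Delta(P, x) &= \sum_{p \in P} \lVert p - x \rVert^2 \\
             &= \sum_{p \in P} \lVert p - c(P) \rVert^2
                + 2 \Big( \sum_{p \in P} \big(p - c(P)\big) \Big) \cdot \big(c(P) - x\big)
                + |P| \, \lVert c(P) - x \rVert^2 .
\end{aligned}
```

Since Σ_{p∈P} (p − c(P)) = 0 by the definition of the centroid, the middle term vanishes, which gives exactly ∆(P, x) = ∆(P, c(P)) + |P| · ∆(c(P), x).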
From this we can make the following observation.
Fact 2.1. Any optimal solution to the 1-means problem with
respect to an input point set P chooses c(P) as the center.
We can also deduce an important property of any opti-
mal solution to the k-means problem. Suppose we are given
an optimal solution to the k-means problem with respect
to the input P. Let K = {x_1, . . . , x_k} be the set of centers constructed by this solution. K produces a partitioning of the point set P into k clusters, namely, P_1, . . . , P_k. P_i is the set of points for which the closest point in K is x_i. In other words, the clusters correspond to the points in the Voronoi regions in ℝ^d with respect to K. Now, Fact 2.1 implies that x_i must be the centroid of P_i for all i.
Since we will be interested in fast algorithms for comput-
ing good approximations to the k-means problem, we first
consider the case k = 1. Inaba et al. [11] showed that the
centroid of a small random sample of points in P can be a
good approximation to c(P ).
Lemma 2.2. [11] Let T be a set of m points obtained by independently sampling m points uniformly at random from a point set P. Then, for any δ > 0,
∆(P, c(T)) < (1 + 1/(δm)) ∆_1(P)
holds with probability at least 1 − δ.
Therefore, if we choose m as 2/ε, then with probability at least 1/2, we get a (1 + ε)-approximation to ∆_1(P) by taking the center as the centroid of T. Thus, a constant size sample can quickly yield a good approximation to the optimal 1-means solution.
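As an illustration of how Lemma 2.2 is typically used (our own sketch with hypothetical names; the paper itself only needs the statement): sample m = ⌈2/ε⌉ points, return their centroid, and with probability at least 1/2 the resulting 1-means cost is within a (1 + ε) factor of ∆_1(P).

```python
import numpy as np

def approx_1means_center(P: np.ndarray, eps: float, rng: np.random.Generator) -> np.ndarray:
    """Centroid of a uniform random sample of m = ceil(2/eps) points of P.

    By Lemma 2.2 with delta = 1/2, Delta(P, c) <= (1 + eps) * Delta_1(P)
    holds with probability at least 1/2 for the returned point c.
    """
    m = int(np.ceil(2.0 / eps))
    # Independent uniform sampling corresponds to drawing indices with replacement.
    idx = rng.integers(0, len(P), size=m)
    return P[idx].mean(axis=0)
```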
Suppose P′ is a subset of P and we want to get a good approximation to the optimal 1-means for the point set P′. Following Lemma 2.2, we would like to sample from P′. But the problem is that P′ is not explicitly given to us. The following lemma states that if the size of P′ is close to that of P, then we can sample a slightly larger set of points from P and hopefully this sample would contain enough random samples from P′. Let us define things more formally first.
Let P be a set of points and P′ be a subset of P such that |P′| ≥ β|P|, where β is a constant between 0 and 1. Suppose we take a sample S of size 4/(βε) from P. Now we consider all possible subsets of size 2/ε of S. For each of these subsets S′, we compute its centroid c(S′), and consider this as a potential center for the 1-means problem instance on P′. In other words, we consider ∆(P′, c(S′)) for all such subsets S′. The following lemma shows that one of these subsets must give a close enough approximation to the optimal 1-means solution for P′.
Lemma 2.3. (Superset Sampling Lemma) The following event happens with constant probability:
min_{S′ : S′ ⊆ S, |S′| = 2/ε} ∆(P′, c(S′)) ≤ (1 + ε)∆_1(P′).
Proof. With constant probability, S contains at least 2/ε points from P′. The rest follows from Lemma 2.2.
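The way the lemma is applied later can be sketched as follows (our own code, not the paper's; the sample size 4/(βε) and the subset size 2/ε come from the discussion above, everything else is an assumption). The point is that one of the returned candidate centroids is, with constant probability, a (1 + ε)-approximate 1-means center for the unknown subset P′:

```python
from itertools import combinations
import numpy as np

def candidate_centers(P: np.ndarray, beta: float, eps: float,
                      rng: np.random.Generator) -> list:
    """Centroids of all subsets of size ceil(2/eps) of a uniform sample of size ceil(4/(beta*eps)).

    If P' is an (unknown) subset of P with |P'| >= beta * |P|, then with constant
    probability some returned centroid c satisfies Delta(P', c) <= (1 + eps) * Delta_1(P').
    """
    sample_size = int(np.ceil(4.0 / (beta * eps)))
    subset_size = int(np.ceil(2.0 / eps))
    S = P[rng.integers(0, len(P), size=sample_size)]
    # One candidate per subset; the number of candidates depends only on beta and eps, not on n.
    return [S[list(c)].mean(axis=0) for c in combinations(range(sample_size), subset_size)]
```

Note that the number of candidates is independent of n, which is what later lets the algorithm try all of them without leaving linear time.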
We use the standard notation B(p, r) to denote the open
ball of radius r around a point p.
We assume the input parameter ε for the approximation factor satisfies 0 < ε ≤ 1.
3. A linear time algorithm for 2-means clustering
Before considering the k-means problem, we consider
the 2-means problem. This contains many of the ideas in-
herent in the more general algorithm. So it will make it eas-
ier to understand the more general algorithm.
Theorem 3.1. Given a point set P of size n in ℝ^d, there exists an algorithm which produces a (1 + ε)-approximation to the optimal 2-means solution on the point set P with constant probability. Further, this algorithm runs in time O(2^{(1/ε)^{O(1)}} dn).
Proof. Let α = ε/64. We can assume that P is (2, α)-irreducible. Indeed suppose P is (2, α)-reducible. Then ∆_1(P) ≤ (1 + ε/2)∆_2(P). We can get a solution to the 1-means problem for P by computing the centroid of P in O(nd) time. The cost of this solution is at most (1 + ε/2)∆_2(P). Thus we have shown the theorem if P is (2, α)-reducible.
Consider an optimal 2-means solution for P. Let c_1 and c_2 be the two centers in this solution. Let P_1 be the points which are closer to c_1 than c_2 and P_2 be the points closer to c_2 than c_1. So c_1 is the centroid of P_1 and c_2 that of P_2. Without loss of generality, assume that |P_1| ≥ |P_2|.
Since |P_1| ≥ |P|/2, Lemma 2.3 implies that if we sample a set S of size O(1/ε) from P and look at the set of centroids of all subsets of S of size 2/ε, then at least one of these centroids, call it c′_1, has the property that ∆(P_1, c′_1) ≤ (1 + α)∆(P_1, c_1). Since our algorithm is going to cycle through all such subsets of S, we can assume that we have found such a point c′_1.
Let the distance between c_1 and c_2 be t, i.e., d(c_1, c_2) = t.
Lemma 3.2. d(c_1, c′_1) ≤ t/4.
Proof. Suppose d(c_1, c′_1) > t/4. Equation (1) implies that
∆(P_1, c′_1) − ∆(P_1, c_1) = |P_1| · ∆(c_1, c′_1) ≥ t^2|P_1|/16.
But we also know that the left hand side is at most α∆(P_1, c_1). Thus we get t^2|P_1| ≤ 16α∆(P_1, c_1).
Applying Equation (1) once again, we see that
∆(P_1, c_2) = ∆(P_1, c_1) + t^2|P_1| ≤ (1 + 16α)∆(P_1, c_1).
Therefore, ∆(P, c_2) ≤ (1 + 16α)∆(P_1, c_1) + ∆(P_2, c_2) ≤ (1 + 16α)∆_2(P). This contradicts the fact that P is (2, α)-irreducible.
Now consider the ball B(c′_1, t/4). The previous lemma implies that this ball is contained in the ball B(c_1, t/2) of radius t/2 centered at c_1. So B(c′_1, t/4) is contained in P_1. Since we are looking for the point c_2, we can delete the points in this ball and hope that the resulting point set has a good fraction of points from P_2.
This is what we prove next. Let P′_1 denote the point set P_1 − B(c′_1, t/4). Let P′ denote P′_1 ∪ P_2. As we noted above, P_2 is a subset of P′.
Claim 3.3. |P_2| ≥ α|P′_1|.
Proof. Suppose not, i.e., |P_2| < α|P′_1|. Notice that
∆(P_1, c′_1) ≥ ∆(P′_1, c′_1) ≥ t^2|P′_1|/16.
Since ∆(P_1, c′_1) ≤ (1 + α)∆(P_1, c_1), it follows that
t^2|P′_1| ≤ 16(1 + α)∆(P_1, c_1).    (2)
So,
∆(P, c_1) = ∆(P_1, c_1) + ∆(P_2, c_1)
          = ∆(P_1, c_1) + ∆(P_2, c_2) + t^2|P_2|
          ≤ ∆(P_1, c_1) + ∆(P_2, c_2) + 16α(1 + α)∆(P_1, c_1)
          ≤ (1 + 32α)∆(P_1, c_1) + ∆(P_2, c_2)
          ≤ (1 + 32α)∆_2(P),
where the second equation follows from (1), while the third inequality follows from (2) and the fact |P_2| < α|P′_1|. But this contradicts the fact that P is (2, α)-irreducible. This proves the claim.
The above claim combined with Lemma 2.2 implies that if we sample O(1/α^2) points from P′, and consider the centroids of all subsets of size 2/α in this sample, then with constant probability we shall get a point c′_2 for which ∆(P_2, c′_2) ≤ (1 + α)∆(P_2, c_2). Thus, we get the centers c′_1 and c′_2 which satisfy the requirements of our lemma.
The only problem is that we do not know the value of
the parameter t. We will somehow need to guess this value
and yet maintain the fact that our algorithm takes only lin-
ear amount of time.
We can assume that we have found c′_1 (this does not require any assumption on t). Now we need to sample from P′ (recall that P′ is the set of points obtained by removing the points in P distant at most t/4 from c′_1). Suppose we know the parameter i such that
n/2^i ≤ |P′| ≤ n/2^{i−1}.
Consider the points of P in descending order of distance from c′_1. Let Q′_i be the first n/2^{i−1} points in this sequence. Notice that P′ is a subset of Q′_i and |P′| ≥ |Q′_i|/2. Also we can find Q′_i in linear time (because we can locate the point at position n/2^{i−1} in linear time). Since |P_2| ≥ α|P′|, we see that |P_2| ≥ α|Q′_i|/2. Thus, Lemma 2.2 implies that it is enough to sample O(1/α^2) points from Q′_i to locate c′_2 (with constant probability of course).
But the problem with this scheme is that we do not know the value i. One option is to try all possible values of i, which will imply a running time of O(n log n) (treating the terms involving α and d as constant). Also note that we cannot use approximate range searching because preprocessing takes O(n log n) time.
We somehow need to combine the sampling and the idea of guessing the value of i. Our algorithm proceeds as follows. It tries values of i in the order 0, 1, 2, . . .. In iteration i, we find the set of points Q′_i. Note that Q′_{i+1} is a subset of Q′_i. In fact Q′_{i+1} is the half of Q′_i which is farther from c′_1. So in iteration (i+1), we can begin from the set of points Q′_i (instead of P′). We can find the candidate point c′_2 by sampling from Q′_{i+1}. Thus we can find Q′_{i+1} in time linear in |Q′_{i+1}| only.
Further, in iteration i, we also maintain the sum ∆(P − Q′_i, c′_1). Since ∆(P − Q′_{i+1}, c′_1) = ∆(P − Q′_i, c′_1) + ∆(Q′_i − Q′_{i+1}, c′_1), we can compute ∆(P − Q′_{i+1}, c′_1) in iteration i + 1 in time linear in |Q′_{i+1}|. This is needed because when we find a candidate c′_2 in iteration i + 1, we need to compute the 2-means solution when all points in P − Q′_i are assigned to c′_1 and the points in Q′_i are assigned to the nearer of c′_1 and c′_2. We can do this in time linear in |Q′_{i+1}| if we maintain the quantities ∆(P − Q′_i, c′_1) for all i.
Thus, we see that iteration i takes time linear in |Q′_i|. Since the |Q′_i|'s decrease by a factor of 2, the overall running time for a given value of c′_1 is O(2^{(1/ε)^{O(1)}} dn). Since the number of possible candidates for c′_1 is O(2^{(1/ε)^{O(1)}}), the running time is as stated.
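The bookkeeping described above can be summarized in the following sketch (ours, for one fixed candidate c′_1; names, constants and the use of a single upfront sort instead of repeated linear-time selection are simplifications, not the paper's exact procedure):

```python
from itertools import combinations
import numpy as np

def best_cost_for_fixed_c1(P: np.ndarray, c1: np.ndarray, alpha: float,
                           rng: np.random.Generator) -> float:
    """Halving scheme for one candidate c'_1: iteration i works on Q_i, the points
    farthest from c1, while `removed_cost` holds Delta(P - Q_i, c1) for the rest."""
    dist2 = ((P - c1) ** 2).sum(axis=1)        # squared distances to c1
    Q = np.argsort(-dist2)                     # indices in descending distance
                                               # (the paper uses linear-time selection per halving)
    best = float(dist2.sum())                  # cost of assigning every point to c1
    removed_cost = 0.0                         # Delta(P - Q_i, c1)
    while len(Q) >= 1:
        # Candidate c'_2's from Q via superset sampling (sizes indicative only).
        sample_size = min(len(Q), int(np.ceil(4.0 / alpha ** 2)))
        subset_size = min(sample_size, int(np.ceil(2.0 / alpha)))
        S = P[Q[rng.integers(0, len(Q), size=sample_size)]]
        for comb in combinations(range(sample_size), subset_size):
            c2 = S[list(comb)].mean(axis=0)
            # Points outside Q_i go to c1; points in Q_i go to the nearer of c1 and c2.
            d2_c2 = ((P[Q] - c2) ** 2).sum(axis=1)
            best = min(best, removed_cost + float(np.minimum(dist2[Q], d2_c2).sum()))
        # Halve Q_i: charge the nearer half to c1 and keep only the farther half.
        half = len(Q) // 2
        removed_cost += float(dist2[Q[half:]].sum())
        Q = Q[:half]
    return best
```

For any fixed α, the inner subset enumeration is the dominant but n-independent factor, which matches the 2^{(1/ε)^{O(1)}} term in the stated running time.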
Claim 3.4. The cost, ∆, reported by the algorithm satisfies ∆_2(P) ≤ ∆ ≤ (1 + α)∆_2(P).
Proof. ∆ ≥ ∆_2(P) is obvious as we are associating each point with one of the 2 centers being reported and accumulating the corresponding cost. Now, consider the case when we have the candidate center set where each center is a (1 + α)-approximate centroid of its respective cluster. As we are associating each point to the approximate centroid of the corresponding cluster or a center closer than it, it follows that ∆ ≤ (1 + α)∆_2(P). If we report the minimum cost clustering, C, then since the actual cost of the clustering (due to the corresponding Voronoi partitioning) can only be better than the cost that we report (because we associate some points with approximate centroids of the corresponding cluster rather than the closest center), we have ∆(C) ≤ (1 + α)∆_2(P).
This proves the theorem.
4. A linear time algorithm for k-means clustering
We now present the general k-means algorithm. We first
present a brief outline of the algorithm.
4.1. Outline
Our algorithm begins on the same lines as the 2-means
algorithm. Again, we can assume that the solution is ir-
reducible, i.e., removing one of the centers does not cre-
ate a solution which has cost within a small factor of the
optimal solution. Consider an optimal solution which has
centers c
1
, . . . , c
k
and which correspondingly partitions the
point set P into clusters P
1
, . . . , P
k
. Assume that |P
1
|
· · · |P
k
|. Our goal again will be to find approximations
c
0
1
, . . . , c
0
k
to c
1
, . . . , c
k
respectively.
Suppose we have found centers c′_1, . . . , c′_i. Suppose t is the distance between the closest pair of centroids {c_1, . . . , c_i} and {c_{i+1}, . . . , c_k}. As in the case of k = 2, we can show that the points at distance at most t/4 from {c′_1, . . . , c′_i} get assigned to c_1, . . . , c_i by the optimal solution. So, we can delete these points. Now we can show that among the remaining points, the size of P_{i+1} is significant. Therefore, we can use random sampling to obtain a center c′_{i+1} which is a pretty good estimate of c_{i+1}. Of course we do not know the value of t, and so a naive implementation of this idea gives an O(n(log n)^k) time algorithm.
Algorithm k-means(P, k, ε)
Inputs : Point set P ,
Number of clusters k,
Approximation ratio ε.
Output : k-means clustering of P .
1. For i = 1 to k do
Obtain the clustering
Irred-k-means(P, i, i, ∅, ε/64k, 0).
2. Return the clustering which has minimum cost.
Figure 1. The k-means Algorithm
So far the algorithm looks very similar to the k = 2
case. But now we want to modify it to a linear time algo-
rithm. This is where the algorithm gets more involved. As
mentioned above, we can not guess the parameter t. So we
try to guess the size of the point set obtained by removing
the balls of radius t/4 around {c_1, . . . , c_i}. So we work with
the remaining point set with the hope that the time taken for
this remaining point set will also be small and so the over-
all time will be linear. Although similar in spirit to the k = 2
case, we still need to prove some more details in this case.
Now, we describe the actual k-means algorithm.
4.2. The algorithm
The algorithm is described in Figures 1 and 2. Figure 1
is the main algorithm. The inputs are the point set P , k and
an approximation factor ε. Let α denote ε/64k. The algo-
rithm k-means(P , k, ε) tries to find the highest i such that
P is (i, α)-irreducible. Essentially we are saying that it is
enough to find i centers only. Since we do not know this
value of i, the algorithm tries all possible values of i.
We now describe the algorithm Irred-k-means(Q, m, k, C, α, Sum). We have found a set C of k − m centers already. The points in P − Q have been assigned to C. We need to assign the remaining points in Q. The case m = 0 is clear. In step 2, we try to find a new center by the random sampling method. This will work provided a good fraction of the points in Q do not get assigned to C. If this is not the case then in step 3, we assign half of the points in Q to C and call the algorithm recursively with this reduced point set. For the base case, when |C| = 0, as P_1 is the largest cluster, we require to sample only O(k) points. This is tackled in Step 2. Step 3 is not performed in this case, as there are no centers.
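Since Figure 2 itself is not reproduced above, the following is only a structural sketch of Irred-k-means reconstructed from this description (the sample and subset sizes are placeholders rather than the paper's constants, and all names are ours); the driver at the bottom mirrors Figure 1:

```python
from itertools import combinations
import numpy as np

def cost_to_centers(Q: np.ndarray, C: list) -> float:
    """Sum of squared distances from each point of Q to its nearest center in C."""
    centers = np.asarray(C)
    d2 = ((Q[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return float(d2.min(axis=1).sum())

def irred_k_means(Q: np.ndarray, m: int, k: int, C: list, alpha: float,
                  acc: float, rng: np.random.Generator) -> float:
    """Skeleton of Irred-k-means(Q, m, k, C, alpha, Sum): Q are the remaining points,
    m centers are still to be found, C holds the k - m centers found so far, and
    acc is the cost of the points of P - Q already assigned to C."""
    if m == 0:
        # All centers found: assign the remaining points to their nearest center in C.
        return acc + cost_to_centers(Q, C)
    best = float("inf")
    # Step 2: guess a new center by superset sampling on Q, recurse with one fewer center to find.
    sample_size = min(len(Q), max(1, int(np.ceil(4.0 * k / alpha))))
    subset_size = min(sample_size, max(1, int(np.ceil(2.0 / alpha))))
    S = Q[rng.integers(0, len(Q), size=sample_size)]
    for comb in combinations(range(sample_size), subset_size):
        c_new = S[list(comb)].mean(axis=0)
        best = min(best, irred_k_means(Q, m - 1, k, C + [c_new], alpha, acc, rng))
    # Step 3 (only if some centers exist): assign the half of Q nearest to C, recurse on the rest.
    if C and len(Q) > 1:
        d2 = ((Q[:, None, :] - np.asarray(C)[None, :, :]) ** 2).sum(axis=2).min(axis=1)
        order = np.argsort(d2)                      # nearest to C first
        near, far = order[: len(Q) // 2], order[len(Q) // 2:]
        best = min(best, irred_k_means(Q[far], m, k, C, alpha, acc + float(d2[near].sum()), rng))
    return best

def k_means(P: np.ndarray, k: int, eps: float, rng: np.random.Generator) -> float:
    """Driver of Figure 1: try i = 1, ..., k centers and keep the cheapest cost found."""
    alpha = eps / (64.0 * k)
    return min(irred_k_means(P, i, i, [], alpha, 0.0, rng) for i in range(1, k + 1))
```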
