
Showing papers by Ulrike von Luxburg published in 2008


Journal Article (DOI)
TL;DR: In this paper, the authors investigate the consistency of the popular family of spectral clustering algorithms, which cluster data using the eigenvectors of graph Laplacian matrices.
Abstract: Consistency is a key property of all statistical procedures analyzing randomly sampled data. Surprisingly, despite decades of work, little is known about consistency of most clustering algorithms. In this paper we investigate consistency of the popular family of spectral clustering algorithms, which clusters the data with the help of eigenvectors of graph Laplacian matrices. We develop new methods to establish that, for increasing sample size, those eigenvectors converge to the eigenvectors of certain limit operators. As a result, we can prove that one of the two major classes of spectral clustering (normalized clustering) converges under very general conditions, while the other (unnormalized clustering) is only consistent under strong additional assumptions, which are not always satisfied in real data. We conclude that our analysis provides strong evidence for the superiority of normalized spectral clustering.
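To make the algorithm family under study concrete, here is a minimal Python sketch of normalized spectral clustering using the symmetrically normalized Laplacian L_sym = I - D^{-1/2} W D^{-1/2} (the Ng-Jordan-Weiss variant, one standard member of the normalized class). It assumes a precomputed symmetric, nonnegative similarity matrix W with strictly positive degrees; it illustrates the general recipe, not the paper's exact formulation.

```python
import numpy as np
from sklearn.cluster import KMeans

def normalized_spectral_clustering(W, k):
    """Minimal sketch: cluster n points given an (n, n) symmetric,
    nonnegative similarity matrix W with strictly positive row sums."""
    d = W.sum(axis=1)                                     # degrees
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    L_sym = np.eye(len(W)) - D_inv_sqrt @ W @ D_inv_sqrt  # normalized Laplacian
    # Eigenvectors for the k smallest eigenvalues (eigh sorts ascending)
    _, eigvecs = np.linalg.eigh(L_sym)
    U = eigvecs[:, :k]
    # Row-normalize the spectral embedding, then cluster it with k-means
    U = U / np.linalg.norm(U, axis=1, keepdims=True)
    return KMeans(n_clusters=k, n_init=10).fit_predict(U)
```

The unnormalized variant would instead use the eigenvectors of L = D - W; the paper's result is that the normalized eigenvectors converge to those of a limit operator under much weaker conditions as the sample size grows.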

204 citations


Proceedings Article
08 Dec 2008
TL;DR: This paper studies the convergence of graph clustering criteria such as the normalized cut (Ncut) as the sample size tends to infinity, and finds that the limit expressions differ across graph types, for example the r-neighborhood graph versus the k-nearest neighbor graph.
Abstract: Graph clustering methods such as spectral clustering are defined for general weighted graphs. In machine learning, however, data is often not given in the form of a graph but in terms of similarity (or distance) values between points. In this case, a neighborhood graph is first constructed from the similarities between the points, and a graph clustering algorithm is then applied to this graph. In this paper we investigate the influence of the construction of the similarity graph on the clustering results. We first study the convergence of graph clustering criteria such as the normalized cut (Ncut) as the sample size tends to infinity. We find that the limit expressions are different for different types of graph, for example the r-neighborhood graph or the k-nearest neighbor graph. In plain words: Ncut on a kNN graph does something systematically different from Ncut on an r-neighborhood graph! This finding shows that graph clustering criteria cannot be studied independently of the kind of graph they are applied to. We also provide examples showing that these differences can already be observed on toy and real data for rather small sample sizes.
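The contrast can be reproduced in a few lines. The Python sketch below (the helper ncut and the parameter choices k = 10 and r = 0.5 are illustrative, not taken from the paper) builds a symmetrized kNN graph and an r-neighborhood graph on the same sample and evaluates Ncut for the same fixed partition on both; the two values generally differ.

```python
import numpy as np
from sklearn.neighbors import kneighbors_graph, radius_neighbors_graph

def ncut(W, labels):
    """Normalized cut of a partition given a dense adjacency matrix W.
    Assumes every cluster has nonzero volume."""
    total = 0.0
    for c in np.unique(labels):
        in_c = labels == c
        cut = W[in_c][:, ~in_c].sum()   # edge weight leaving cluster c
        vol = W[in_c].sum()             # volume (total degree) of cluster c
        total += cut / vol
    return total

X = np.random.RandomState(0).randn(200, 2)   # toy sample
labels = (X[:, 0] > 0).astype(int)           # one fixed candidate partition

# Two graphs on the same points: kNN (symmetrized) vs. r-neighborhood.
W_knn = kneighbors_graph(X, n_neighbors=10, mode='connectivity').toarray()
W_knn = np.maximum(W_knn, W_knn.T)
W_r = radius_neighbors_graph(X, radius=0.5, mode='connectivity').toarray()

print("Ncut on kNN graph:           ", ncut(W_knn, labels))
print("Ncut on r-neighborhood graph:", ncut(W_r, labels))
```

Intuitively, the degrees of a kNN graph adapt to the local density while those of an r-neighborhood graph do not, so the two constructions weight regions of different density differently; this is the source of the differing limit expressions.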

173 citations


Posted Content
TL;DR: This article provides a gentle, non-technical overview of the key ideas and insights of statistical learning theory, aimed at a broad audience.
Abstract: Statistical learning theory provides the theoretical basis for many of today's machine learning algorithms. In this article we attempt to give a gentle, non-technical overview of the key ideas and insights of statistical learning theory. We target a broad audience, not necessarily machine learning researchers. This paper can serve as a starting point for people who want to get an overview of the field before diving into technical details.

7 citations