Open Access, Posted Content

A Tutorial on Spectral Clustering

TLDR
This tutorial describes different graph Laplacians and their basic properties, presents the most common spectral clustering algorithms, and derives those algorithms from scratch by several different approaches.
Abstract
In recent years, spectral clustering has become one of the most popular modern clustering algorithms. It is simple to implement, can be solved efficiently by standard linear algebra software, and very often outperforms traditional clustering algorithms such as the k-means algorithm. At first glance spectral clustering appears slightly mysterious, and it is not obvious why it works at all or what it really does. The goal of this tutorial is to give some intuition on those questions. We describe different graph Laplacians and their basic properties, present the most common spectral clustering algorithms, and derive those algorithms from scratch by several different approaches. Advantages and disadvantages of the different spectral clustering algorithms are discussed.
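To make the claim that the method needs only standard linear algebra software concrete, here is a minimal sketch of a two-cluster spectral partition. It is an illustration, not the tutorial's own algorithm listing. It assumes a fully connected Gaussian similarity graph, the unnormalized Laplacian L = D - W, and a split on the sign of the second eigenvector. The function names, the `sigma` parameter, and the toy data are all hypothetical.

```python
import numpy as np

def gaussian_similarity(X, sigma=1.0):
    """Fully connected similarity graph with Gaussian weights (assumed choice)."""
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-sq_dists / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)  # no self-loops
    return W

def unnormalized_laplacian(W):
    """L = D - W, where D is the diagonal matrix of vertex degrees."""
    return np.diag(W.sum(axis=1)) - W

def spectral_bipartition(X, sigma=1.0):
    """Split the points into two groups by the sign of the second eigenvector."""
    L = unnormalized_laplacian(gaussian_similarity(X, sigma))
    # eigh handles symmetric matrices and returns eigenvalues in ascending order.
    _, eigvecs = np.linalg.eigh(L)
    second = eigvecs[:, 1]  # eigenvector of the second-smallest eigenvalue
    return (second > 0).astype(int)

# Hypothetical toy data: two tight groups of three points each.
points = np.array([
    [0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
    [3.0, 3.0], [3.1, 3.0], [3.0, 3.1],
])
labels = spectral_bipartition(points)
print(labels)  # which group gets label 0 depends on the eigenvector's sign
```

For more than two clusters, a common variant takes the first k eigenvectors as a new representation of the points and runs k-means on its rows. The sign split above is the simplest special case.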


Citations
Journal Article

Community detection in graphs

TL;DR: A thorough exposition of the main elements of the clustering problem can be found in this paper, with a special focus on techniques designed by statistical physicists, from the discussion of crucial issues like the significance of clustering and how methods should be tested and compared against each other, to the description of applications to real networks.
Proceedings Article

Spectral Networks and Locally Connected Networks on Graphs

TL;DR: This paper considers possible generalizations of CNNs to signals defined on more general domains without the action of a translation group, and proposes two constructions, one based upon a hierarchical clustering of the domain, and another based on the spectrum of the graph Laplacian.
Journal Article

Community Structure in Large Networks: Natural Cluster Sizes and the Absence of Large Well-Defined Clusters

TL;DR: This paper employs approximation algorithms for the graph-partitioning problem to characterize as a function of size the statistical and structural properties of partitions of graphs that could plausibly be interpreted as communities, and defines the network community profile plot, which characterizes the "best" possible community—according to the conductance measure—over a wide range of size scales.
Posted Content

Community Structure in Large Networks: Natural Cluster Sizes and the Absence of Large Well-Defined Clusters

TL;DR: In this article, the authors employ approximation algorithms for the graph partitioning problem to characterize as a function of size the statistical and structural properties of partitions of graphs that could plausibly be interpreted as communities.
Proceedings Article

Statistical properties of community structure in large social and information networks

TL;DR: It is found that a generative model, in which new edges are added via an iterative "forest fire" burning process, is able to produce graphs exhibiting a network community structure similar to that observed in nearly every network dataset examined.
References
Book

Matrix computations

Gene H. Golub
Journal Article

Normalized cuts and image segmentation

TL;DR: This work treats image segmentation as a graph partitioning problem and proposes a novel global criterion, the normalized cut, for segmenting the graph, which measures both the total dissimilarity between the different groups as well as the total similarity within the groups.
Proceedings Article

On Spectral Clustering: Analysis and an algorithm

TL;DR: A simple spectral clustering algorithm that can be implemented using a few lines of Matlab is presented, and tools from matrix perturbation theory are used to analyze the algorithm, and give conditions under which it can be expected to do well.
Journal Article

Laplacian Eigenmaps for dimensionality reduction and data representation

TL;DR: In this article, the authors propose a geometrically motivated algorithm for representing high-dimensional data, based on the correspondence between the graph Laplacian, the Laplace-Beltrami operator on the manifold, and the connections to the heat equation.