scispace - formally typeset
Proceedings ArticleDOI

Scalable clustering and applications

Reads0
Chats0
TLDR
A two step algorithm for spectral clustering to reduce the time complexity toO(nmk + m2k'), by combining both Nyström and Lanczos method, shows very good results, with various data sets, image segmentation problems and churn prediction of a telecommunication data set, even with very low sampling.
Abstract
Large scale machine learning is becoming an active research area recently. Most of the existing clustering algorithms cannot handle big data due to its high time and space complexity. Among the clustering algorithms, eigen vector based clustering, such as Spectral clustering, shows very good accuracy, but it has cubic time complexity. There are various methods proposed to reduce the time and space complexity for eigen decomposition such as Nystrom method, Lanc-zos method etc. Nystrom method has linear time complexity in terms of number of data points, but has cubic time complexity in terms of number of sampling points. To reduce this, various Rank k approximation methods also proposed, but which are less efficient compare to the normalized spectral clustering. In this paper we propose a two step algorithm for spectral clustering to reduce the time complexity toO(nmk + m2k'), by combining both Nystrom and Lanczos method, where k is the number of clusters and k' is the rank k approximation of the sampling matrix (k

read more

Citations
More filters

IEEE transactions on pattern analysis and machine intelligence

Ieee Xplore
TL;DR: This special issue aims at gathering the recent advances in learning with shared information methods and their applications in computer vision and multimedia analysis and addressing interesting real-world computer Vision and multimedia applications.
References
More filters
Journal ArticleDOI

Normalized cuts and image segmentation

TL;DR: This work treats image segmentation as a graph partitioning problem and proposes a novel global criterion, the normalized cut, for segmenting the graph, which measures both the total dissimilarity between the different groups as well as the total similarity within the groups.
Proceedings ArticleDOI

Normalized cuts and image segmentation

TL;DR: This work treats image segmentation as a graph partitioning problem and proposes a novel global criterion, the normalized cut, for segmenting the graph, which measures both the total dissimilarity between the different groups as well as the total similarity within the groups.
Journal ArticleDOI

A tutorial on spectral clustering

TL;DR: In this article, the authors present the most common spectral clustering algorithms, and derive those algorithms from scratch by several different approaches, and discuss the advantages and disadvantages of these algorithms.
Proceedings Article

On Spectral Clustering: Analysis and an algorithm

TL;DR: A simple spectral clustering algorithm that can be implemented using a few lines of Matlab is presented, and tools from matrix perturbation theory are used to analyze the algorithm, and give conditions under which it can be expected to do well.
Proceedings ArticleDOI

A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics

TL;DR: In this paper, the authors present a database containing ground truth segmentations produced by humans for images of a wide variety of natural scenes, and define an error measure which quantifies the consistency between segmentations of differing granularities.
Related Papers (5)