scispace - formally typeset
Topic

CURE data clustering algorithm

About: CURE data clustering algorithm is a research topic. Over its lifetime, 13,766 publications have been published within this topic, receiving 461,296 citations.
Papers

Proceedings Article
02 Aug 1996-
Abstract: Clustering algorithms are attractive for the task of class identification in spatial databases. However, the application to large spatial databases raises the following requirements for clustering algorithms: minimal requirements of domain knowledge to determine the input parameters, discovery of clusters with arbitrary shape and good efficiency on large databases. The well-known clustering algorithms offer no solution to the combination of these requirements. In this paper, we present the new clustering algorithm DBSCAN relying on a density-based notion of clusters which is designed to discover clusters of arbitrary shape. DBSCAN requires only one input parameter and supports the user in determining an appropriate value for it. We performed an experimental evaluation of the effectiveness and efficiency of DBSCAN using synthetic data and real data of the SEQUOIA 2000 benchmark. The results of our experiments demonstrate that (1) DBSCAN is significantly more effective in discovering clusters of arbitrary shape than the well-known algorithm CLARANS, and that (2) DBSCAN outperforms CLARANS by a factor of more than 100 in terms of efficiency.

14,552 citations


Proceedings Article
01 Jan 1996-
TL;DR: DBSCAN, a new clustering algorithm relying on a density-based notion of clusters and designed to discover clusters of arbitrary shape, is presented; it requires only one input parameter and supports the user in determining an appropriate value for it.
Abstract: Clustering algorithms are attractive for the task of class identification in spatial databases. However, the application to large spatial databases raises the following requirements for clustering algorithms: minimal requirements of domain knowledge to determine the input parameters, discovery of clusters with arbitrary shape and good efficiency on large databases. The well-known clustering algorithms offer no solution to the combination of these requirements. In this paper, we present the new clustering algorithm DBSCAN relying on a density-based notion of clusters which is designed to discover clusters of arbitrary shape. DBSCAN requires only one input parameter and supports the user in determining an appropriate value for it. We performed an experimental evaluation of the effectiveness and efficiency of DBSCAN using synthetic data and real data of the SEQUOIA 2000 benchmark. The results of our experiments demonstrate that (1) DBSCAN is significantly more effective in discovering clusters of arbitrary shape than the well-known algorithm CLARANS, and that (2) DBSCAN outperforms CLARANS by a factor of more than 100 in terms of efficiency.

14,280 citations
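
The density-based notion of clusters described in the DBSCAN abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. The names `dbscan`, `region_query`, `eps` and `min_pts` are chosen here for illustration, and the default `min_pts=4` is an assumption of the sketch. Fixing `min_pts` leaves the radius `eps` as the only value a caller has to pick, which is one way to read "only one input parameter". The neighbourhood search is a brute-force scan, so it does not reflect the efficiency reported in the abstract.

```python
from math import dist

NOISE = -1  # label for points that belong to no cluster


def region_query(points, i, eps):
    """Return indices of all points within eps of points[i] (including i)."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts=4):
    """Label each point with a cluster id (0, 1, ...) or NOISE.

    A point is a core point if at least min_pts points (itself included)
    lie within eps of it. Clusters grow outward from core points.
    """
    labels = [None] * len(points)  # None means "not visited yet"
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may be relabelled later as a border point
            continue
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue  # already assigned to a cluster
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point: keep expanding
        cluster += 1
    return labels
```

Called on two compact groups of four points plus one distant point, with `eps` slightly larger than the spacing inside each group, the sketch assigns one label per group and marks the distant point as noise.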



Book
01 Jan 1988-

8,580 citations


Proceedings Article
03 Jan 2001-
TL;DR: A simple spectral clustering algorithm that can be implemented using a few lines of Matlab is presented, and tools from matrix perturbation theory are used to analyze the algorithm, and give conditions under which it can be expected to do well.
Abstract: Despite many empirical successes of spectral clustering methods (algorithms that cluster points using eigenvectors of matrices derived from the data), there are several unresolved issues. First, there are a wide variety of algorithms that use the eigenvectors in slightly different ways. Second, many of these algorithms have no proof that they will actually compute a reasonable clustering. In this paper, we present a simple spectral clustering algorithm that can be implemented using a few lines of Matlab. Using tools from matrix perturbation theory, we analyze the algorithm, and give conditions under which it can be expected to do well. We also show surprisingly good experimental results on a number of challenging clustering problems.

8,315 citations
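
The spectral clustering abstract does not list the algorithm's steps, so the following NumPy sketch rests on assumptions. It uses a commonly described formulation: Gaussian affinities with a zero diagonal, a symmetrically normalized affinity matrix, the top k eigenvectors, row normalization, then k-means on the rows. The function names, the `sigma` parameter and the deterministic farthest-point k-means initialization are all choices made here for illustration. It is written in Python rather than Matlab.

```python
import numpy as np


def spectral_embedding(X, k, sigma=1.0):
    """Map each row of X to a unit vector built from the top-k eigenvectors."""
    # Pairwise squared Euclidean distances.
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    # Gaussian affinity with zero self-affinity (assumed formulation).
    A = np.exp(-sq / (2.0 * sigma ** 2))
    np.fill_diagonal(A, 0.0)
    # Symmetric normalization: D^{-1/2} A D^{-1/2}.
    d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))
    L = A * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    # eigh returns eigenvalues in ascending order; keep the last k vectors.
    _, vecs = np.linalg.eigh(L)
    Y = vecs[:, -k:]
    # Normalize each row to unit length.
    return Y / np.linalg.norm(Y, axis=1, keepdims=True)


def kmeans(Y, k, iters=50):
    """Plain Lloyd iterations with a deterministic farthest-point start."""
    centers = [Y[0]]
    for _ in range(1, k):
        d2 = np.min([np.sum((Y - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(Y[np.argmax(d2)])
    centers = np.array(centers, dtype=float)
    for _ in range(iters):
        dists = ((Y[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = np.argmin(dists, axis=1)
        for c in range(k):
            if np.any(labels == c):
                centers[c] = Y[labels == c].mean(axis=0)
    return labels


def spectral_clustering(X, k, sigma=1.0):
    """Cluster the rows of X into k groups via the embedding above."""
    return kmeans(spectral_embedding(np.asarray(X, dtype=float), k, sigma), k)
```

The result depends strongly on `sigma`, which sets how quickly affinity decays with distance. The sketch leaves that choice to the caller.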


Network Information
Related Topics (5)
Correlation clustering

19.3K papers, 602.5K citations

94% related
Canopy clustering algorithm

12K papers, 339.4K citations

93% related
Fuzzy clustering

23.2K papers, 601.2K citations

92% related
Association rule learning

15.1K papers, 362K citations

91% related
Single-linkage clustering

6.3K papers, 261.6K citations

91% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2021    1
2020    3
2019    17
2018    113
2017    678
2016    1,041

Top Attributes


Topic's top 5 most impactful authors

Hans-Peter Kriegel

31 papers, 36.6K citations

Licheng Jiao

26 papers, 492 citations

Witold Pedrycz

25 papers, 1.6K citations

Sanghamitra Bandyopadhyay

24 papers, 1.2K citations

Yasunori Endo

23 papers, 93 citations