A density-based algorithm for discovering clusters in large spatial databases with noise

Open Access · Proceedings Article

- pp 226-231

TLDR

This paper proposes a density-based notion of clusters for discovering clusters of arbitrary shape. The resulting algorithm can be used for class identification in large spatial databases and is shown to be more efficient than the well-known algorithm CLARANS.

Abstract:

Clustering algorithms are attractive for the task of class identification in spatial databases. However, the application to large spatial databases raises the following requirements for clustering algorithms: minimal requirements of domain knowledge to determine the input parameters, discovery of clusters with arbitrary shape, and good efficiency on large databases. The well-known clustering algorithms offer no solution to the combination of these requirements. In this paper, we present the new clustering algorithm DBSCAN, which relies on a density-based notion of clusters and is designed to discover clusters of arbitrary shape. DBSCAN requires only one input parameter and supports the user in determining an appropriate value for it. We performed an experimental evaluation of the effectiveness and efficiency of DBSCAN using synthetic data and real data of the SEQUOIA 2000 benchmark. The results of our experiments demonstrate that (1) DBSCAN is significantly more effective in discovering clusters of arbitrary shape than the well-known algorithm CLARANS, and that (2) DBSCAN outperforms CLARANS by a factor of more than 100 in terms of efficiency.
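
The abstract does not spell out the procedure, so the following is only a minimal sketch of the density-based idea, not the authors' implementation. It assumes the two quantities usually associated with DBSCAN: a neighborhood radius (here `eps`) and a minimum neighbor count (here `min_pts`, given a default so that only `eps` has to be chosen, in the spirit of the "one input parameter" claim). The function names, the default value, and the brute-force neighbor search are illustrative assumptions; an efficient version on a large database would use a spatial index instead.

```python
from math import dist

NOISE = -1        # label for points that belong to no cluster
UNVISITED = None  # label for points not yet examined


def region_query(points, i, eps):
    """Return indices of all points within distance eps of points[i] (i included).

    Brute-force scan, O(n) per query; assumed here only for simplicity.
    """
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts=4):
    """Assign a cluster id (0, 1, ...) or NOISE to every point.

    A point with at least min_pts neighbors inside radius eps is treated as
    a core point; clusters grow by repeatedly absorbing the neighborhoods
    of core points.
    """
    labels = [UNVISITED] * len(points)
    cluster_id = 0
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            # Not dense enough to start a cluster; may still become a
            # border point of a cluster discovered later.
            labels[i] = NOISE
            continue
        labels[i] = cluster_id
        seeds = [j for j in neighbors if j != i]
        while seeds:
            j = seeds.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point: reachable, not core
                continue
            if labels[j] is not UNVISITED:
                continue  # already assigned to a cluster
            labels[j] = cluster_id
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                seeds.extend(j_neighbors)  # j is core: keep expanding
        cluster_id += 1
    return labels
```

Because clusters are grown through chains of dense neighborhoods rather than around a centroid, this style of procedure can follow elongated or irregular shapes, and points in sparse regions are simply left labeled as noise.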
