A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise
Citations
20,196 citations
9,627 citations
6,601 citations
5,744 citations
Cites background from "A density-based algorithm for disco..."
...3) Many novel algorithms have been developed to cluster large-scale data sets, especially in the context of data mining [44], [45], [85], [135], [213], [248]....
[...]
...DBSCAN requires that the density in a neighborhood for an object should be high enough if it belongs to a cluster....
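The density requirement in this excerpt can be illustrated with a minimal sketch. It assumes the two parameters of the original DBSCAN paper, a neighborhood radius and a minimum point count, here given the hypothetical names `eps` and `min_pts`, and it uses a brute-force neighborhood scan rather than a spatial index. It is an illustration under those assumptions, not the authors' implementation.

```python
from math import dist

NOISE = -1

def region_query(points, i, eps):
    """Indices of all points within distance eps of points[i] (i included)."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or NOISE."""
    labels = [None] * len(points)
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            # Not dense enough to start a cluster; may become a border point later.
            labels[i] = NOISE
            continue
        labels[i] = cluster
        seeds = list(neighbors)
        while seeds:
            j = seeds.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: reachable but not dense itself
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                # Only dense (core) points extend the cluster further.
                seeds.extend(j_neighbors)
        cluster += 1
    return labels

points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=3)
```

In this toy input the two squares of four points each form one cluster, and the isolated point at (50, 50) is labeled as noise because its neighborhood contains only itself.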
[...]
...Clustering Algorithms
• A. Distance and Similarity Measures (See also Table I)
• B. Hierarchical
  - Agglomerative: single linkage, complete linkage, group average linkage, median linkage, centroid linkage, Ward’s method, balanced iterative reducing and clustering using hierarchies (BIRCH), clustering using representatives (CURE), robust clustering using links (ROCK)
  - Divisive: divisive analysis (DIANA), monothetic analysis (MONA)
• C. Squared Error-Based (Vector Quantization): k-means, iterative self-organizing data analysis technique (ISODATA), genetic k-means algorithm (GKA), partitioning around medoids (PAM)
• D. pdf Estimation via Mixture Densities: Gaussian mixture density decomposition (GMDD), AutoClass
• E. Graph Theory-Based: Chameleon, Delaunay triangulation graph (DTG), highly connected subgraphs (HCS), clustering identification via connectivity kernels (CLICK), cluster affinity search technique (CAST)
• F. Combinatorial Search Techniques-Based: genetically guided algorithm (GGA), TS clustering, SA clustering
• G. Fuzzy: fuzzy c-means (FCM), mountain method (MM), possibilistic c-means clustering algorithm (PCM), fuzzy c-shells (FCS)
• H. Neural Networks-Based: learning vector quantization (LVQ), self-organizing feature map (SOFM), ART, simplified ART (SART), hyperellipsoidal clustering network (HEC), self-splitting competitive learning network (SPLL)
• I. Kernel-Based: kernel k-means, support vector clustering (SVC)
• J. Sequential Data: sequence similarity, indirect sequence clustering, statistical sequence clustering
• K. Large-Scale Data Sets (See also Table II): CLARA, CURE, CLARANS, BIRCH, DBSCAN, DENCLUE, WaveCluster, FC, ART
• L. Data Visualization and High-Dimensional Data: PCA, ICA, projection pursuit, Isomap, LLE, CLIQUE, OptiGrid, ORCLUS
• M....
[...]
...BIRCH was generalized into a broader framework in [101] with two algorithms realization, named as BUBBLE and BUBBLE-FM. d) Density-based approach, e.g., density based spatial clustering of applications with noise (DBSCAN) [85] and density-based clustering (DENCLUE) [135]....
[...]
...DBSCAN uses an R*-tree structure for more efficient queries....
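To show why a spatial index speeds up neighborhood queries, the sketch below swaps in a much simpler structure, a uniform grid, in place of the tree index the excerpt refers to. The class name `GridIndex` and its `cell` parameter are hypothetical, and the code handles only 2-D points. The idea carries over: a query inspects only the buckets that can contain points within the radius instead of scanning the whole data set.

```python
from collections import defaultdict
from math import ceil, dist, floor

class GridIndex:
    """Uniform grid over 2-D points; a simple stand-in for a tree-based index."""

    def __init__(self, points, cell):
        self.points = points
        self.cell = cell
        self.buckets = defaultdict(list)
        for i, p in enumerate(points):
            self.buckets[self._key(p)].append(i)

    def _key(self, p):
        # Integer coordinates of the grid cell that contains p.
        return (floor(p[0] / self.cell), floor(p[1] / self.cell))

    def query(self, center, eps):
        """Sorted indices of all points within distance eps of center."""
        reach = ceil(eps / self.cell)  # number of cells the radius can span
        cx, cy = self._key(center)
        hits = []
        for gx in range(cx - reach, cx + reach + 1):
            for gy in range(cy - reach, cy + reach + 1):
                for i in self.buckets.get((gx, gy), ()):
                    if dist(center, self.points[i]) <= eps:
                        hits.append(i)
        return sorted(hits)

pts = [(0.0, 0.0), (0.5, 0.5), (3.0, 3.0), (-0.9, 0.0)]
index = GridIndex(pts, cell=1.0)
near_origin = index.query((0.0, 0.0), eps=1.0)
```

A grid works well only when the cell size roughly matches the query radius and the data are not heavily skewed; tree-based indexes adapt to the data distribution, which a fixed grid does not.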
[...]
5,506 citations
Cites background from "A density-based algorithm for disco..."
...Point clouds on the t-SNE map represent candidate cell types; density clustering (Ester et al., 1996) identified these regions....
[...]
References
10,537 citations
Additional excerpts
...For each of the discovered clusterings the silhouette coefficient (Kaufman & Rousseeuw 1990) is calculated, and finally, the clustering with the maximum silhouette coefficient is chosen as the “natural” clustering....
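The selection rule in this excerpt can be sketched as follows. The code computes the mean silhouette width of a labeled point set and returns the candidate labeling with the highest score. The function names `silhouette` and `pick_natural` are hypothetical, Euclidean distance is assumed, and the candidate labelings are supplied by the caller rather than produced by CLARANS.

```python
from math import dist

def silhouette(points, labels):
    """Mean silhouette width over all points; needs at least two clusters."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("need at least two clusters")
    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # a singleton contributes 0 by convention
        # a: mean distance to the other members of the point's own cluster.
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(sum(dist(p, q) for q in other) / len(other)
                for key, other in clusters.items() if key != lab)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)

def pick_natural(points, candidate_labelings):
    """Return the labeling with the maximum mean silhouette width."""
    return max(candidate_labelings, key=lambda labels: silhouette(points, labels))

points = [(0, 0), (0, 1), (10, 0), (10, 1)]
good = [0, 0, 1, 1]   # groups nearby points together
bad = [0, 1, 0, 1]    # mixes the two groups
best = pick_natural(points, [bad, good])
```

Here the labeling that groups nearby points scores close to 1, while the mixed labeling scores below 0, so the first is chosen.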
[...]
...Clustering Algorithms There are two basic types of clustering algorithms (Kaufman & Rousseeuw 1990): partitioning and hierarchical algorithms....
[...]
...Kaufman L., and Rousseeuw P.J. 1990....
[...]
4,686 citations
"A density-based algorithm for disco..." refers background or methods in this paper
...Brinkhoff T., Kriegel H.-P., Schneider R., and Seeger B. 1994 Efficient Multi-Step Processing of Spatial Joins, Proc....
[...]
...clusters found by a partitioning algorithm is convex which is very restrictive. Ng & Han (1994) explore partitioning algorithms for KDD in spatial databases....
[...]
...moderate values for n, but it is prohibitive for applications on large databases....
[...]
...Unfortunately, the run time of this approach is prohibitive for large n, because it implies O(n) calls of CLARANS. Jain (1988) explores a density based approach to identify clusters in k-dimensional point sets....
[...]
...CLARANS assumes that all objects to be clustered can reside in main memory at the same time, which does not hold for large databases. Furthermore, the run time of CLARANS is prohibitive on large databases. Therefore, Ester, Kriegel & Xu (1995) present several focusing techniques which address both of these problems by focusing the clustering process on the relevant parts of the database....
[...]
...clusters found by a partitioning algorithm is convex which is very restrictive. Ng & Han (1994) explore partitioning algorithms for KDD in spatial databases. An algorithm called CLARANS (Clustering Large Applications based on RANdomized Search) is introduced which is an improved k-medoid method. Compared to former k-medoid algorithms, CLARANS is more effective and more efficient. An experimental evaluation indicates that CLARANS runs efficiently on databases of thousands of objects. Ng & Han (1994) also discuss methods to determine the “natural” number k_nat of clusters in a database....
[...]
...moderate values for n, but it is prohibitive for applications on large databases....
[...]
1,999 citations
"A density-based algorithm for disco..." refers background in this paper
...Ng & Han (1994) also discuss methods to determine the “natural” number k_nat of clusters in a database....
[...]
...Ng & Han (1994) explore partitioning algorithms for...
[...]
744 citations
"A density-based algorithm for disco..." refers background in this paper
...Spatial Database Systems (SDBS) (Gueting 1994) are database systems for the management of spatial data....
[...]