Open Access Proceedings Article

A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise

TLDR
DBSCAN is a new clustering algorithm that relies on a density-based notion of clusters and is designed to discover clusters of arbitrary shape. It requires only one input parameter and supports the user in determining an appropriate value for it.
Abstract
Clustering algorithms are attractive for the task of class identification in spatial databases. However, the application to large spatial databases raises the following requirements for clustering algorithms: minimal requirements of domain knowledge to determine the input parameters, discovery of clusters with arbitrary shape, and good efficiency on large databases. The well-known clustering algorithms offer no solution to the combination of these requirements. In this paper, we present the new clustering algorithm DBSCAN, which relies on a density-based notion of clusters and is designed to discover clusters of arbitrary shape. DBSCAN requires only one input parameter and supports the user in determining an appropriate value for it. We performed an experimental evaluation of the effectiveness and efficiency of DBSCAN using synthetic data and real data of the SEQUOIA 2000 benchmark. The results of our experiments demonstrate that (1) DBSCAN is significantly more effective in discovering clusters of arbitrary shape than the well-known algorithm CLARANS, and that (2) DBSCAN outperforms CLARANS by a factor of more than 100 in terms of efficiency.
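To make the density-based notion of clusters concrete, the following is a minimal sketch, not the authors' implementation. It assumes the usual formulation in which a point is a core point when at least `min_pts` points (itself included) lie within distance `eps` of it, clusters grow outward from core points, and points reachable from no core point are labeled noise. The names `dbscan`, `region_query`, `eps`, `min_pts`, and `NOISE` are illustrative choices, and the default of `min_pts=4` is an assumption of this sketch. The neighborhood query is a plain linear scan; an efficient version on a large database would use a spatial index instead.

```python
from math import dist  # Euclidean distance, Python 3.8+

NOISE = -1  # label for points that belong to no cluster (hypothetical convention)


def region_query(points, i, eps):
    """Return indices of all points within distance eps of points[i], including i."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts=4):
    """Label each point with a cluster id (0, 1, ...) or NOISE."""
    labels = [None] * len(points)  # None means "not yet visited"
    cluster_id = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            # Not a core point; it may still become a border point later.
            labels[i] = NOISE
            continue
        # points[i] is a core point: start a new cluster and expand it.
        cluster_id += 1
        labels[i] = cluster_id
        seeds = [j for j in neighbors if j != i]
        while seeds:
            j = seeds.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # former noise becomes a border point
            if labels[j] is not None:
                continue  # already assigned to a cluster
            labels[j] = cluster_id
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                seeds.extend(j_neighbors)  # j is a core point too: keep growing
    return labels


if __name__ == "__main__":
    # Two small dense groups and one isolated point (made-up data).
    points = [
        (0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
        (10, 10), (10, 11), (11, 10), (11, 11), (10.5, 10.5),
        (5, 5),
    ]
    print(dbscan(points, eps=1.5, min_pts=4))
```

On this toy input the two dense groups receive separate cluster ids and the isolated point at (5, 5) is labeled as noise, since no other point lies within `eps` of it.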


Citations
Journal Article

Tracking Depression Dynamics in College Students Using Mobile Phone and Wearable Sensing

TL;DR: A new approach to predicting depression using passive sensing data from students' smartphones and wearables is proposed and it is shown that symptom features derived from phone and wearable sensors can predict whether or not a student is depressed on a week by week basis.
Journal Article

Detection and tracking of pedestrians and vehicles using roadside LiDAR sensors

TL;DR: Fundamental concepts, solution algorithms, and application guidance associated with using infrastructure-based LiDAR sensors to accurately detect and track pedestrians and vehicles at intersections are explored.
Journal Article

MobilityGraphs: Visual Analysis of Mass Mobility Dynamics via Spatio-Temporal Graphs and Clustering

TL;DR: A graph-based method, called MobilityGraphs, is developed, which reveals movement patterns that were occluded in flow maps, and enables the visual representation of the spatio-temporal variation of movements for long time series of spatial situations originally containing a large number of intersecting flows.
Journal Article

The (black) art of runtime evaluation: Are we comparing algorithms or implementations?

TL;DR: This work substantiates its points with extensive experiments, using clustering and outlier detection methods with and without index acceleration, and discusses what one can learn from evaluations, whether experiments are properly designed, and what kind of conclusions one should avoid.
Proceedings Article

On the Origins of Memes by Means of Fringe Web Communities

TL;DR: In this article, the authors detect and measure the propagation of memes across multiple Web communities, using a processing pipeline based on perceptual hashing and clustering techniques, and a dataset of 160M images from 2.6B posts gathered from Twitter, Reddit, 4chan's Politically Incorrect board (/pol/), and Gab, over the course of 13 months.
References
Book

Finding Groups in Data: An Introduction to Cluster Analysis

TL;DR: An introductory textbook on cluster analysis, covering partitioning and hierarchical clustering methods.
Proceedings Article

The R*-tree: an efficient and robust access method for points and rectangles

TL;DR: The R*-tree is designed which incorporates a combined optimization of area, margin and overlap of each enclosing rectangle in the directory which clearly outperforms the existing R-tree variants.
Proceedings Article

Efficient and Effective Clustering Methods for Spatial Data Mining

TL;DR: The analysis and experiments show that with the assistance of CLARANS, these two algorithms are very effective and can lead to discoveries that are difficult to find with current spatial data mining algorithms.
Journal Article

An introduction to spatial database systems

TL;DR: This work surveys data modeling, querying, data structures and algorithms, and system architecture for spatial database systems, with the emphasis on describing known technology in a coherent manner, rather than listing open problems.