Journal Article

Fast density peak clustering for large scale data based on kNN

TLDR
FastDPeak, a simple but fast variant of DPeak, is proposed. It runs in about O(n log n) expected time in the intrinsic dimensionality and replaces density with kNN-density, which is computed by a fast kNN algorithm such as the cover tree, greatly speeding up the density computation.
Abstract
The Density Peak (DPeak) clustering algorithm is not applicable to large-scale data, because its two quantities, ρ and δ, are both obtained by a brute-force algorithm with complexity O(n^2). Thus, a simple but fast DPeak, namely FastDPeak, is proposed, which runs in about O(n log n) expected time in the intrinsic dimensionality. It replaces density with kNN-density, which is computed by a fast kNN algorithm such as the cover tree, yielding a large speedup in the density computation. Based on kNN-density, local density peaks and non-local density peaks are identified, and a fast algorithm with complexity O(n), which uses two different strategies to compute δ for the two groups, is also proposed. Experimental results show that FastDPeak is effective and outperforms other variants of DPeak.
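
The idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation, and several details are assumptions made for illustration: kNN-density is taken to be the inverse of the distance to the k-th nearest neighbor, a local density peak is taken to be a point with no denser point among its k nearest neighbors, and the function names are hypothetical. A brute-force neighbor search stands in for the cover tree, and the δ of a local density peak is found by a plain scan over all denser points, so this sketch does not reach the complexity claimed for FastDPeak. Points are assumed to be distinct.

```python
import math


def knn(points, k):
    """Brute-force stand-in for a fast kNN index such as a cover tree.

    Returns, for each point, its k nearest other points as a list of
    (distance, index) pairs sorted by increasing distance.
    """
    result = []
    for i, p in enumerate(points):
        dists = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )
        result.append(dists[:k])
    return result


def knn_density(neighbors):
    # Assumed definition: inverse of the distance to the k-th neighbor.
    return [1.0 / nbrs[-1][0] for nbrs in neighbors]


def delta_and_parent(points, neighbors, rho):
    """Compute delta and the nearest denser point ("parent") per point."""
    n = len(points)
    delta = [0.0] * n
    parent = [-1] * n
    local_peaks = []

    # Strategy 1 (non-local peaks): the nearest denser point is looked up
    # directly in the kNN list that was already computed.
    for i in range(n):
        for d, j in neighbors[i]:
            if rho[j] > rho[i]:
                delta[i], parent[i] = d, j
                break
        else:
            local_peaks.append(i)  # no denser point among the k neighbors

    # Strategy 2 (local peaks): simplified here to a scan over all denser
    # points; the paper describes a faster strategy for this case.
    for i in local_peaks:
        best_d, best_j = math.inf, -1
        for j in range(n):
            if rho[j] > rho[i]:
                d = math.dist(points[i], points[j])
                if d < best_d:
                    best_d, best_j = d, j
        if best_j == -1:
            # Densest point overall: use the largest distance to any point.
            delta[i] = max(math.dist(points[i], q) for q in points)
        else:
            delta[i], parent[i] = best_d, best_j
    return delta, parent, local_peaks
```

Points with both high ρ and high δ are then the candidate cluster centers, as in the original DPeak; each remaining point can be assigned to the cluster of its parent.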
