Arrays of (locality-sensitive) Count Estimators (ACE): Anomaly Detection on the Edge

doi:10.1145/3178876.3186056

Proceedings Article (Open Access)

Chen Luo, +1 more

- pp 1439-1448

TLDR

This paper proposes the ACE (Arrays of (locality-sensitive) Count Estimators) algorithm, which can be 60x faster than most state-of-the-art unsupervised anomaly detection algorithms and has appealing privacy properties.

Abstract:

Anomaly detection is one of the most frequent and important subroutines deployed in large-scale data processing applications. Although the topic is well studied, existing techniques for unsupervised anomaly detection require storing significant amounts of data, which is prohibitive from memory, latency, and privacy perspectives, especially for small mobile devices, which have ultra-low memory budgets and limited computational power. In this paper, we propose the ACE (Arrays of (locality-sensitive) Count Estimators) algorithm, which can be 60x faster than most state-of-the-art unsupervised anomaly detection algorithms. In addition, ACE has appealing privacy properties. Our experiments show that the ACE algorithm has a significantly smaller memory footprint (< 4MB in our experiments), which can exploit the Level 3 cache of any modern processor. At the core of the ACE algorithm is a novel statistical estimator derived from the sampling view of Locality Sensitive Hashing (LSH). This view is significantly different from, and more efficient than, the widely popular view of LSH for near-neighbor search. We show the superiority of the ACE algorithm over 11 popular baselines on 3 benchmark datasets, including the KDD-Cup99 data, which is the largest available public benchmark, comprising more than half a million entries with ground-truth anomaly labels.
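
The abstract does not spell out the data structure, so the following is only a minimal sketch of the general idea of "arrays of locality-sensitive count estimators", not the authors' implementation. It assumes signed random projections as the LSH family, and the class name `ACESketch`, the parameters `num_bits` and `num_arrays`, and the toy data are all hypothetical choices made for illustration. Each array keeps one counter per hash bucket; a point is scored by the average count of the buckets it falls into, so points that land in sparsely populated buckets receive low scores.

```python
import random


class ACESketch:
    """Illustrative sketch: several arrays of counters indexed by LSH codes."""

    def __init__(self, dim, num_bits=4, num_arrays=20, seed=0):
        rng = random.Random(seed)
        self.num_arrays = num_arrays
        # Assumption: signed random projections, num_bits hyperplanes per array.
        self.planes = [
            [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]
            for _ in range(num_arrays)
        ]
        # Only counters are kept; the data points themselves are not stored.
        self.counts = [[0] * (2 ** num_bits) for _ in range(num_arrays)]
        self.n = 0

    def _bucket(self, planes, x):
        # Concatenate the sign bits of the projections into one bucket index.
        code = 0
        for plane in planes:
            dot = sum(p * v for p, v in zip(plane, x))
            code = (code << 1) | (1 if dot >= 0 else 0)
        return code

    def add(self, x):
        # Increment one counter in every array.
        for planes, counts in zip(self.planes, self.counts):
            counts[self._bucket(planes, x)] += 1
        self.n += 1

    def score(self, x):
        # Average count of the buckets x hashes to; low values suggest anomalies.
        total = sum(
            counts[self._bucket(planes, x)]
            for planes, counts in zip(self.planes, self.counts)
        )
        return total / self.num_arrays


# Toy data: 500 points clustered around one direction in 5 dimensions.
rng = random.Random(1)
center = [1.0] * 5
sketch = ACESketch(dim=5)
for _ in range(500):
    sketch.add([c + rng.gauss(0.0, 0.1) for c in center])

inlier_score = sketch.score(center)
outlier_score = sketch.score([-c for c in center])  # opposite direction
print(inlier_score, outlier_score)
```

In this sketch memory depends only on the number of counters (`num_arrays * 2 ** num_bits`), not on the number of points added, which is consistent with the small, fixed footprint the abstract reports. Because signed random projections depend only on direction, this toy version can separate points only by angle; how a score is turned into an anomaly decision (for example, a threshold relative to typical scores) is left open here.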

Citations

Journal Article

Ranking outliers using symmetric neighborhood relationship

Wen Jin, +3 more

- 01 Jan 2006 -

Lecture Notes in Computer Science

TL;DR: In this article, the authors proposed a measure on local outliers based on a symmetric neighborhood relationship, which considers both neighbors and reverse neighbors of an object when estimating its density distribution.

Journal Article

A neural data structure for novelty detection.

Sanjoy Dasgupta, +3 more

- 18 Dec 2018 -

Proceedings of the National Academy of S...

TL;DR: This work found that the fruit fly olfactory circuit evolved a variant of a Bloom filter to assess the novelty of odors, and develops a class of distance- and time-sensitive Bloom filters that outperform prior filters when evaluated on several biological and computational datasets.

Proceedings Article

Space and Time Efficient Kernel Density Estimation in High Dimensions

Arturs Backurs, +2 more

TL;DR: This work instantiates their framework with the Laplacian and Exponential kernels, two popular kernels which possess the aforementioned property, and presents an improvement to their framework that retains the same query time, while requiring only linear space and linear preprocessing time.

Journal Article

Review and State of Art of Fog Computing

Asif Ali Laghari, +2 more

- 01 Aug 2021 -

Archives of Computational Methods in Eng...

TL;DR: This paper describes fog computing technology, infrastructure, and applications, and presents the latest development of fog networking, quality of experience, cloud at the edge, platforms, security, and privacy.

Proceedings Article

Rehashing Kernel Evaluation in High Dimensions

Paris Siminelakis, +4 more

TL;DR: This paper proposes and implements provable and practical procedures for adaptive sample size selection, preprocessing time reduction, and refined variance bounds that quantify the data-dependent performance of random sampling and hashing-based kernel evaluation methods.

