Anomaly detection

About: Anomaly detection is a research topic. Over its lifetime, 24,275 publications have been published within this topic, receiving 407,896 citations. The topic is also known as outlier detection and novelty detection.


Journal Article. DOI: 10.1145/1541880.1541882
Abstract: Anomaly detection is an important problem that has been researched within diverse research areas and application domains. Many anomaly detection techniques have been specifically developed for certain application domains, while others are more generic. This survey tries to provide a structured and comprehensive overview of the research on anomaly detection. We have grouped existing techniques into different categories based on the underlying approach adopted by each technique. For each category we have identified key assumptions, which are used by the techniques to differentiate between normal and anomalous behavior. When applying a given technique to a particular domain, these assumptions can be used as guidelines to assess the effectiveness of the technique in that domain. For each category, we provide a basic anomaly detection technique, and then show how the different existing techniques in that category are variants of the basic technique. This template provides an easier and more succinct understanding of the techniques belonging to each category. Further, for each category, we identify the advantages and disadvantages of the techniques in that category. We also provide a discussion on the computational complexity of the techniques since it is an important issue in real application domains. We hope that this survey will provide a better understanding of the different directions in which research has been done on this topic, and how techniques developed in one area can be applied in domains for which they were not intended to begin with.

Topics: Anomaly detection (56%), Local outlier factor (52%)

7,894 Citations

Journal Article. DOI: 10.1145/335191.335388
16 May 2000
Abstract: For many KDD applications, such as detecting criminal activities in E-commerce, finding the rare instances or the outliers can be more interesting than finding the common patterns. Existing work in outlier detection regards being an outlier as a binary property. In this paper, we contend that for many scenarios, it is more meaningful to assign to each object a degree of being an outlier. This degree is called the local outlier factor (LOF) of an object. It is local in that the degree depends on how isolated the object is with respect to the surrounding neighborhood. We give a detailed formal analysis showing that LOF enjoys many desirable properties. Using real-world datasets, we demonstrate that LOF can be used to find outliers which appear to be meaningful, but can otherwise not be identified with existing approaches. Finally, a careful performance evaluation of our algorithm confirms that our approach of finding local outliers can be practical.

Topics: Local outlier factor (64%), Outlier (58%), Anomaly detection (52%)

4,117 Citations
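
The abstract above describes LOF as a degree of outlierness that depends on how isolated an object is relative to its neighborhood. A minimal pure-Python sketch of that idea follows, using the standard definitions of reachability distance and local reachability density. It is an illustration under stated assumptions, not the authors' implementation: the function names and the toy data are hypothetical, distances are Euclidean, each point gets exactly k neighbours (ties broken by index rather than kept, as the full definition would), and the data is assumed to contain no heavily duplicated points.

```python
import math

def k_nearest(points, i, k):
    """Return sorted (distance, index) pairs for the k nearest neighbours of points[i]."""
    dists = sorted((math.dist(points[i], q), j)
                   for j, q in enumerate(points) if j != i)
    return dists[:k]

def local_outlier_factors(points, k=3):
    """Compute a simplified LOF score for every point (assumes len(points) > k)."""
    n = len(points)
    neighbours = [k_nearest(points, i, k) for i in range(n)]
    # k-distance: distance from each point to its k-th nearest neighbour.
    k_dist = [nb[-1][0] for nb in neighbours]

    # Local reachability density: inverse of the mean reachability distance,
    # where reach_dist(p, o) = max(k_dist(o), d(p, o)).
    lrd = []
    for i in range(n):
        reach = [max(k_dist[j], d) for d, j in neighbours[i]]
        lrd.append(len(reach) / sum(reach))

    # LOF: mean ratio of the neighbours' density to the point's own density.
    return [sum(lrd[j] for _, j in neighbours[i]) / (k * lrd[i])
            for i in range(n)]

# Hypothetical toy data: a 3x3 grid of inliers plus one isolated point.
points = [(x, y) for x in range(3) for y in range(3)] + [(10, 10)]
scores = local_outlier_factors(points, k=3)
for p, s in zip(points, scores):
    print(p, round(s, 2))
```

Scores near 1 indicate a point about as dense as its neighbours, while scores well above 1 indicate a point in a sparser region than its neighbours, which is what makes the score a degree rather than a binary label.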

Open access. Journal Article. DOI: 10.1023/B:AIRE.0000045502.10941.A9
Victoria J. Hodge, Jim Austin
Abstract: Outlier detection has been used for centuries to detect and, where appropriate, remove anomalous observations from data. Outliers arise due to mechanical faults, changes in system behaviour, fraudulent behaviour, human error, instrument error or simply through natural deviations in populations. Their detection can identify system faults and fraud before they escalate with potentially catastrophic consequences. It can identify errors and remove their contaminating effect on the data set, and as such purify the data for processing. The original outlier detection methods were arbitrary, but now principled and systematic techniques are used, drawn from the full gamut of Computer Science and Statistics. In this paper, we introduce a survey of contemporary techniques for outlier detection. We identify their respective motivations and distinguish their advantages and disadvantages in a comparative review.

Topics: Outlier (62%), Anomaly detection (58%)

2,897 Citations

Open access. Journal Article. DOI: 10.1023/B:MACH.0000008084.60811.49
David M. J. Tax, Robert P. W. Duin
01 Jan 2004, Machine Learning
Abstract: Data domain description concerns the characterization of a data set. A good description covers all target data but includes no superfluous space. The boundary of a dataset can be used to detect novel data or outliers. We will present the Support Vector Data Description (SVDD), which is inspired by the Support Vector Classifier. It obtains a spherically shaped boundary around a dataset and, analogous to the Support Vector Classifier, it can be made flexible by using other kernel functions. The method is made robust against outliers in the training set and is capable of tightening the description by using negative examples. We show characteristics of the Support Vector Data Descriptions using artificial and real data.

Topics: Margin classifier (62%), Structured support vector machine (61%), Kernel method (60%)

2,431 Citations
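
The abstract above describes a spherically shaped boundary around the target data, with points outside it treated as novel. The sketch below illustrates only the simplest special case: no kernel, no slack for training outliers, and no negative examples, in which case the description reduces to the smallest ball enclosing the training data. It does not solve the SVDD optimization problem; instead it swaps in the Badoiu-Clarkson iteration to approximate that ball. The function names, iteration count, and toy data are hypothetical.

```python
import math

def fit_enclosing_ball(points, iterations=1000):
    """Approximate the smallest ball containing all points (Badoiu-Clarkson iteration)."""
    center = list(points[0])
    for i in range(1, iterations + 1):
        # Step the center toward the currently farthest point, with a shrinking step size.
        farthest = max(points, key=lambda p: math.dist(center, p))
        center = [c + (f - c) / (i + 1) for c, f in zip(center, farthest)]
    # Radius is chosen so that every training point lies inside or on the boundary.
    radius = max(math.dist(center, p) for p in points)
    return center, radius

def is_outlier(x, center, radius):
    """Flag x as novel if it falls outside the fitted ball."""
    return math.dist(center, x) > radius

# Hypothetical toy target data: the corners and center of a 2x2 square.
train = [(0, 0), (2, 0), (0, 2), (2, 2), (1, 1)]
center, radius = fit_enclosing_ball(train)
print(is_outlier((1.0, 1.5), center, radius))
print(is_outlier((5.0, 5.0), center, radius))
```

A kernelized description with slack variables, as the abstract outlines, would allow a tighter and non-spherical boundary in the input space; this sketch only conveys the "inside the ball or not" decision rule.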

Proceedings Article. DOI: 10.1109/CISDA.2009.5356528
08 Jul 2009
Abstract: During the last decade, anomaly detection has attracted the attention of many researchers to overcome the weakness of signature-based IDSs in detecting novel attacks, and KDDCUP'99 is the most widely used data set for the evaluation of these systems. Having conducted a statistical analysis on this data set, we found two important issues that strongly affect the performance of evaluated systems and result in a very poor evaluation of anomaly detection approaches. To solve these issues, we have proposed a new data set, NSL-KDD, which consists of selected records of the complete KDD data set and does not suffer from any of the mentioned shortcomings.

Topics: Anomaly detection (54%), Data set (52%)

2,387 Citations

Topic's top 5 most impactful authors

Christopher Leckie: 55 papers, 2.1K citations

Christos Faloutsos: 42 papers, 2.4K citations

Fabrizio Angiulli: 29 papers, 818 citations

Salvatore J. Stolfo: 22 papers, 2.3K citations

Christian Callegari: 20 papers, 223 citations

Related Topics (5)

Feature selection: 41.4K papers, 1M citations, 92% related

Support vector machine: 73.6K papers, 1.7M citations, 92% related

Dimensionality reduction: 21.9K papers, 579.2K citations, 92% related

Feature vector: 48.8K papers, 954.4K citations, 91% related

Supervised learning: 20.8K papers, 710.5K citations, 91% related