A Survey of Outlier Detection Methodologies

doi:10.1023/B:AIRE.0000045502.10941.A9

Open Access Journal Article

Victoria J. Hodge, +1 more

- 01 Oct 2004 -

Artificial Intelligence Review

- Vol. 22, Iss: 2, pp 85-126

TLDR

This paper surveys contemporary techniques for outlier detection, identifies their respective motivations, and distinguishes their advantages and disadvantages in a comparative review.

Abstract:

Outlier detection has been used for centuries to detect and, where appropriate, remove anomalous observations from data. Outliers arise due to mechanical faults, changes in system behaviour, fraudulent behaviour, human error, instrument error or simply through natural deviations in populations. Their detection can identify system faults and fraud before they escalate with potentially catastrophic consequences. It can also identify errors and remove their contaminating effect on the data set, and so purify the data for processing. The original outlier detection methods were arbitrary, but principled and systematic techniques are now used, drawn from the full gamut of Computer Science and Statistics. In this paper, we introduce a survey of contemporary techniques for outlier detection. We identify their respective motivations and distinguish their advantages and disadvantages in a comparative review.

Citations

Journal Article

Anomaly detection: A survey

Varun Chandola, +2 more

- 30 Jul 2009 -

ACM Computing Surveys

TL;DR: This survey tries to provide a structured and comprehensive overview of the research on anomaly detection by grouping existing techniques into different categories based on the underlying approach adopted by each technique.

Book

Introduction to Machine Learning

Ethem Alpaydin

TL;DR: Introduction to Machine Learning is a comprehensive textbook on the subject, covering a broad array of topics not usually included in introductory machine learning texts, and discusses many methods from different fields, including statistics, pattern recognition, neural networks, artificial intelligence, signal processing, control, and data mining.

Journal Article

Supervised Machine Learning: A Review of Classification Techniques

Sotiris Kotsiantis

- 01 Jan 2007 -

Informatica (lithuanian Academy of Scien...

TL;DR: The goal of supervised learning is to build a concise model of the distribution of class labels in terms of predictor features, and the resulting classifier is then used to assign class labels to the testing instances where the values of the predictor features are known, but the value of the class label is unknown.

Journal Article

Classification in the Presence of Label Noise: A Survey

Benoît Frénay, +1 more

- 01 May 2014 -

IEEE Transactions on Neural Networks

TL;DR: In this paper, label noise consists of mislabeled instances: no additional information, such as confidences on labels, is assumed to be available.

Journal Article

Review: A review of novelty detection

Marco A. F. Pimentel, +3 more

- 01 Jun 2014 -

Signal Processing

TL;DR: This review aims to provide an updated and structured investigation of novelty detection research papers that have appeared in the machine learning literature during the last decade.

References

Journal Article

Classification and Regression Trees.

John Van Ryzin, +4 more

- 01 Mar 1986 -

Journal of the American Statistical Asso...

Book

C4.5: Programs for Machine Learning

J. Ross Quinlan

TL;DR: A complete guide to the C4.5 system as implemented in C for the UNIX environment, which starts from simple core learning methods and shows how they can be elaborated and extended to deal with typical problems such as missing data and overfitting.

Book

Neural networks for pattern recognition

Christopher M. Bishop

TL;DR: This is the first comprehensive treatment of feed-forward neural networks from the perspective of statistical pattern recognition, and is designed as a text, with over 100 exercises, to benefit anyone involved in the fields of neural computation and pattern recognition.

Journal Article

Induction of Decision Trees

J. R. Quinlan

- 25 Mar 1986 -

Machine Learning

TL;DR: This paper summarizes an approach to synthesizing decision trees that has been used in a variety of systems, describes one such system, ID3, in detail, and discusses a reported shortcoming of the basic algorithm.

Proceedings Article

A density-based algorithm for discovering clusters in large spatial databases with noise

Martin Ester, +3 more

TL;DR: This paper proposes a density-based notion of clusters designed to discover clusters of arbitrary shape. The resulting algorithm can be used for class identification in large spatial databases and is shown to be more efficient than the well-known algorithm CLARANS.

Related Papers (5)

Anomaly detection: A survey

LOF: identifying density-based local outliers

Estimating the Support of a High-Dimensional Distribution

Outliers in Statistical Data

Procedures for Detecting Outlying Observations in Samples