Journal ArticleDOI

Discovering Anomalies by Incorporating Feedback from an Expert

TLDR
The Active Anomaly Discovery (AAD) algorithm is described, which incorporates feedback from an expert user who labels a queried data instance as an anomaly or a nominal point. Approximations are also presented that make the AAD algorithm much more computationally efficient while maintaining a desirable level of performance.
Abstract
Unsupervised anomaly detection algorithms search for outliers and then predict that these outliers are the anomalies. When deployed, however, these algorithms are often criticized for high false-positive and high false-negative rates. One main cause of poor performance is that not all outliers are anomalies and not all anomalies are outliers. In this article, we describe the Active Anomaly Discovery (AAD) algorithm, which incorporates feedback from an expert user who labels a queried data instance as an anomaly or nominal point. This feedback is intended to adjust the anomaly detector so that the outliers it discovers are more in tune with the expert user’s semantic understanding of the anomalies. The AAD algorithm is based on a weighted ensemble of anomaly detectors. When it receives a label from the user, it adjusts the weights on each individual ensemble member such that the anomalies rank higher in terms of their anomaly score than the outliers. The AAD approach is designed to operate in an interactive data exploration loop. In each iteration of this loop, our algorithm first selects a data instance to present to the expert as a potential anomaly and then the expert labels the instance as an anomaly or as a nominal data point. When it receives the instance label, the algorithm updates its internal model and the loop continues until a budget of B queries is spent. The goal of our approach is to maximize the total number of true anomalies in the B instances presented to the expert. We show that the AAD method performs well and in some cases doubles the number of true anomalies found compared to previous methods. In addition, we present approximations that make the AAD algorithm much more computationally efficient while maintaining a desirable level of performance.
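The query-then-update loop described in the abstract can be sketched in a few lines. This is a minimal toy version, not the authors' implementation: the per-detector score vectors, the multiplicative-style weight update, and the simulated expert (the `labels` list) are all assumptions made for illustration; the actual AAD method optimizes a ranking objective over the ensemble weights.

```python
def aad_loop(scores, labels, budget, step=0.5):
    """Toy active anomaly discovery loop (a sketch, not the AAD paper's code).

    scores : list of per-instance score vectors, one score per ensemble member
    labels : ground truth used to simulate the expert (1 = anomaly, 0 = nominal)
    budget : number of expert queries B
    Returns (number of true anomalies found, final detector weights).
    """
    m = len(scores[0])
    w = [1.0 / m] * m                 # start with uniform detector weights
    queried = set()
    found = 0
    for _ in range(budget):
        # rank unlabeled instances by their weighted combined anomaly score
        def combined(i):
            return sum(wj * sj for wj, sj in zip(w, scores[i]))
        i = max((i for i in range(len(scores)) if i not in queried), key=combined)
        queried.add(i)
        y = labels[i]                 # simulated expert feedback
        found += y
        # crude update: boost detectors that scored a confirmed anomaly highly,
        # shrink them when the queried instance turned out to be nominal
        sign = 1.0 if y == 1 else -1.0
        w = [max(1e-6, wj + sign * step * sj) for wj, sj in zip(w, scores[i])]
        total = sum(w)
        w = [wj / total for wj in w]  # renormalize to keep weights comparable
    return found, w
```

With two detectors and two true anomalies that detector 0 scores highly, the loop spends its budget on the anomalies and shifts weight toward detector 0 — mirroring the abstract's goal of maximizing true anomalies among the B queried instances.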


Citations
Journal ArticleDOI

A Unifying Review of Deep and Shallow Anomaly Detection

TL;DR: This review aims to identify the common underlying principles and the assumptions that are often made implicitly by various methods in deep learning, and draws connections between classic “shallow” and novel deep approaches and shows how this relation might cross-fertilize or extend both directions.
Journal ArticleDOI

Improving Out-of-Distribution Detection by Learning From the Deployment Environment

TL;DR: This work introduces advanced DNN training methods to codesign for accuracy and OOD detection in the offline training phase, and proposes a novel “learn-online” workflow for updating the DNNs during deployment using a small library of carefully collected samples from the operating environment.
Proceedings ArticleDOI

Anomaly Detection by Leveraging Incomplete Anomalous Knowledge with Anomaly-Aware Bidirectional GANs

TL;DR: An anomaly-aware generative adversarial network is developed, which, in addition to modeling the normal samples as most GANs do, is able to explicitly avoid assigning probabilities for collected anomalous samples.
Proceedings ArticleDOI

Active-MTSAD: Multivariate Time Series Anomaly Detection With Active Learning

TL;DR: An active anomaly detection framework named Active-MTSAD suitable for multi-dimensional time series, combining unsupervised anomaly detection and active learning is proposed, and three feedback strategies are introduced, namely denominator penalty, negative penalty, and metric learning.
References
Proceedings ArticleDOI

Optimizing search engines using clickthrough data

TL;DR: The goal of this paper is to develop a method that utilizes clickthrough data for training, namely the query-log of the search engine in connection with the log of links the users clicked on in the presented ranking.
Proceedings ArticleDOI

Isolation Forest

TL;DR: The use of isolation enables the proposed method, iForest, to exploit sub-sampling to an extent that is not feasible in existing methods, creating an algorithm which has a linear time complexity with a low constant and a low memory requirement.
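The isolation idea summarized above — anomalies are easier to separate from the rest of the data, so they sit at shorter depths in random partition trees — can be illustrated with a one-dimensional toy. This is only a sketch under simplifying assumptions: the actual iForest builds trees over sub-sampled multi-dimensional data and normalizes path lengths into a score in [0, 1], none of which is reproduced here.

```python
import random

def path_length(data, x, rng, depth=0, max_depth=10):
    """Path length of point x in one random isolation tree (1-D toy version)."""
    if len(data) <= 1 or depth >= max_depth:
        return depth
    lo, hi = min(data), max(data)
    if lo == hi:
        return depth
    split = rng.uniform(lo, hi)
    # recurse only into the side of the split that contains x
    side = [v for v in data if (v < split) == (x < split)]
    return path_length(side, x, rng, depth + 1, max_depth)

def avg_path_length(data, x, n_trees=100, seed=0):
    """Average isolation depth over many random trees.

    Shorter average path => easier to isolate => more anomalous.
    """
    rng = random.Random(seed)
    return sum(path_length(data, x, rng) for _ in range(n_trees)) / n_trees
```

A point far from a tight cluster is typically cut off by the very first random split, so its average depth is much smaller than that of a point inside the cluster.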
Proceedings ArticleDOI

Query by committee

TL;DR: A query-by-committee approach is described in which a committee of students is trained on the same data set and queries are selected where the committee disagrees; it is suggested that asymptotically finite information gain may be an important characteristic of good query algorithms.
Proceedings ArticleDOI

A new polynomial-time algorithm for linear programming

TL;DR: The algorithm consists of repeated application of such projective transformations each followed by optimization over an inscribed sphere to create a sequence of points which converges to the optimal solution in polynomial-time.
Proceedings ArticleDOI

An Analysis of Active Learning Strategies for Sequence Labeling Tasks

TL;DR: This paper surveys previously used query selection strategies for sequence models, and proposes several novel algorithms to address their shortcomings, and conducts a large-scale empirical comparison.