Journal ArticleDOI

Discovering Anomalies by Incorporating Feedback from an Expert

TLDR
The Active Anomaly Discovery (AAD) algorithm is described, which incorporates feedback from an expert user who labels a queried data instance as an anomaly or a nominal point. Approximations are also presented that make the AAD algorithm much more computationally efficient while maintaining a desirable level of performance.
Abstract
Unsupervised anomaly detection algorithms search for outliers and then predict that these outliers are the anomalies. When deployed, however, these algorithms are often criticized for high false-positive and high false-negative rates. One main cause of poor performance is that not all outliers are anomalies and not all anomalies are outliers. In this article, we describe the Active Anomaly Discovery (AAD) algorithm, which incorporates feedback from an expert user who labels a queried data instance as an anomaly or nominal point. This feedback is intended to adjust the anomaly detector so that the outliers it discovers are more in tune with the expert user’s semantic understanding of the anomalies. The AAD algorithm is based on a weighted ensemble of anomaly detectors. When it receives a label from the user, it adjusts the weights on each individual ensemble member such that the anomalies rank higher in terms of their anomaly score than the outliers. The AAD approach is designed to operate in an interactive data exploration loop. In each iteration of this loop, our algorithm first selects a data instance to present to the expert as a potential anomaly and then the expert labels the instance as an anomaly or as a nominal data point. When it receives the instance label, the algorithm updates its internal model and the loop continues until a budget of B queries is spent. The goal of our approach is to maximize the total number of true anomalies in the B instances presented to the expert. We show that the AAD method performs well and in some cases doubles the number of true anomalies found compared to previous methods. In addition, we present approximations that make the AAD algorithm much more computationally efficient while maintaining a desirable level of performance.
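The query-then-update loop described in the abstract can be sketched in a few lines. This is a minimal toy version, not the authors' implementation: the per-detector score vectors, the multiplicative-style weight update, and the simulated expert (the `labels` list) are all assumptions made for illustration; the actual AAD method optimizes a ranking objective over the ensemble weights.

```python
def aad_loop(scores, labels, budget, step=0.5):
    """Toy active anomaly discovery loop (a sketch, not the AAD paper's code).

    scores : list of per-instance score vectors, one score per ensemble member
    labels : ground truth used to simulate the expert (1 = anomaly, 0 = nominal)
    budget : number of expert queries B
    Returns (number of true anomalies found, final detector weights).
    """
    m = len(scores[0])
    w = [1.0 / m] * m                 # start with uniform detector weights
    queried = set()
    found = 0
    for _ in range(budget):
        # rank unlabeled instances by their weighted combined anomaly score
        def combined(i):
            return sum(wj * sj for wj, sj in zip(w, scores[i]))
        i = max((i for i in range(len(scores)) if i not in queried), key=combined)
        queried.add(i)
        y = labels[i]                 # simulated expert feedback
        found += y
        # crude update: boost detectors that scored a confirmed anomaly highly,
        # shrink them when the queried instance turned out to be nominal
        sign = 1.0 if y == 1 else -1.0
        w = [max(1e-6, wj + sign * step * sj) for wj, sj in zip(w, scores[i])]
        total = sum(w)
        w = [wj / total for wj in w]  # renormalize to keep weights comparable
    return found, w
```

With two detectors and two true anomalies that detector 0 scores highly, the loop spends its budget on the anomalies and shifts weight toward detector 0 — mirroring the abstract's goal of maximizing true anomalies among the B queried instances.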


Citations
Journal ArticleDOI

A Unifying Review of Deep and Shallow Anomaly Detection

TL;DR: This review aims to identify the common underlying principles and the assumptions that are often made implicitly by various methods in deep learning, and draws connections between classic “shallow” and novel deep approaches and shows how this relation might cross-fertilize or extend both directions.
Journal ArticleDOI

Improving Out-of-Distribution Detection by Learning From the Deployment Environment

TL;DR: This work introduces advanced DNN training methods to codesign for accuracy and OOD detection in the offline training phase, and proposes a novel “learn-online” workflow for updating the DNNs during deployment using a small library of carefully collected samples from the operating environment.
Proceedings ArticleDOI

Anomaly Detection by Leveraging Incomplete Anomalous Knowledge with Anomaly-Aware Bidirectional GANs

TL;DR: An anomaly-aware generative adversarial network is developed, which, in addition to modeling the normal samples as most GANs do, is able to explicitly avoid assigning probabilities for collected anomalous samples.
Proceedings ArticleDOI

Active-MTSAD: Multivariate Time Series Anomaly Detection With Active Learning

TL;DR: An active anomaly detection framework named Active-MTSAD suitable for multi-dimensional time series, combining unsupervised anomaly detection and active learning is proposed, and three feedback strategies are introduced, namely denominator penalty, negative penalty, and metric learning.
References
Proceedings ArticleDOI

Optimizing search engines using clickthrough data

TL;DR: The goal of this paper is to develop a method that utilizes clickthrough data for training, namely the query-log of the search engine in connection with the log of links the users clicked on in the presented ranking.
Proceedings ArticleDOI

Isolation Forest

TL;DR: The use of isolation enables the proposed method, iForest, to exploit sub-sampling to an extent that is not feasible in existing methods, creating an algorithm which has a linear time complexity with a low constant and a low memory requirement.
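The isolation idea summarized above — anomalies are easier to separate from the rest of the data, so they sit at shorter depths in random partition trees — can be illustrated with a one-dimensional toy. This is only a sketch under simplifying assumptions: the actual iForest builds trees over sub-sampled multi-dimensional data and normalizes path lengths into a score in [0, 1], none of which is reproduced here.

```python
import random

def path_length(data, x, rng, depth=0, max_depth=10):
    """Path length of point x in one random isolation tree (1-D toy version)."""
    if len(data) <= 1 or depth >= max_depth:
        return depth
    lo, hi = min(data), max(data)
    if lo == hi:
        return depth
    split = rng.uniform(lo, hi)
    # recurse only into the side of the split that contains x
    side = [v for v in data if (v < split) == (x < split)]
    return path_length(side, x, rng, depth + 1, max_depth)

def avg_path_length(data, x, n_trees=100, seed=0):
    """Average isolation depth over many random trees.

    Shorter average path => easier to isolate => more anomalous.
    """
    rng = random.Random(seed)
    return sum(path_length(data, x, rng) for _ in range(n_trees)) / n_trees
```

A point far from a tight cluster is typically cut off by the very first random split, so its average depth is much smaller than that of a point inside the cluster.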
Proceedings ArticleDOI

Query by committee

TL;DR: A query-by-committee approach is described in which a committee of students is trained on the same data set and queries are selected where the committee disagrees; it is suggested that asymptotically finite information gain may be an important characteristic of good query algorithms.
Proceedings ArticleDOI

A new polynomial-time algorithm for linear programming

TL;DR: The algorithm consists of repeated application of such projective transformations each followed by optimization over an inscribed sphere to create a sequence of points which converges to the optimal solution in polynomial-time.
Proceedings ArticleDOI

An Analysis of Active Learning Strategies for Sequence Labeling Tasks

TL;DR: This paper surveys previously used query selection strategies for sequence models, and proposes several novel algorithms to address their shortcomings, and conducts a large-scale empirical comparison.