Multidimensional Data Mining for Anomaly Extraction

doi:10.1109/ICACC.2013.8

Proceedings Article•DOI•

29 Aug 2013-pp 5-8

TL;DR: By applying multidimensional association rule mining to anomaly extraction, the technique effectively finds the flows associated with anomalous events, which can reduce the work-hours needed for analyzing alarms and make anomaly detection systems more effective.

Abstract: Heavy traffic makes network monitoring a difficult and cumbersome job, and the probability of network attacks increases substantially as a result. Hence there is a need for anomaly extraction: finding the flows associated with anomalous events within the large set of flows observed during an anomalous time interval. Anomaly extraction is very important for root-cause analysis, network forensics, attack mitigation and anomaly modeling. To identify the suspicious flows, we use meta-data provided by several histogram-based detectors and then apply association rule mining with a multidimensional mining concept to find and summarize anomalous flows. Using rich traffic data from a backbone network, we show that our technique effectively finds the flows associated with the anomalous events. Applying multidimensional rule mining to anomaly extraction can therefore reduce the work-hours needed for analyzing alarms and make anomaly detection systems more effective.
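The two-step idea in the abstract (pre-filter flows with detector meta-data, then mine frequent itemsets to summarize them) can be sketched roughly as follows. This is only an illustration, not the authors' implementation: the flow fields (`srcIP`, `dstIP`, `dstPort`, `proto`), the sample flows, the "match any suspicious value" pre-filter, and the `min_support` threshold are all assumptions made for the example, and the mining step is a plain Apriori-style level-wise search.

```python
from collections import Counter
from itertools import combinations


def prefilter(flows, metadata):
    """Keep flows matching at least one suspicious feature value (assumed rule)."""
    return [f for f in flows
            if any(f.get(feat) in values for feat, values in metadata.items())]


def frequent_itemsets(flows, min_support):
    """Apriori-style level-wise search over (feature, value) items."""
    transactions = [frozenset(f.items()) for f in flows]
    # Level 1: count single (feature, value) items.
    counts = Counter(item for t in transactions for item in t)
    current = {frozenset([i]): c for i, c in counts.items() if c >= min_support}
    result = dict(current)
    k = 2
    while current:
        # Candidates of size k are unions of two frequent (k-1)-itemsets.
        candidates = {a | b for a, b in combinations(current, 2) if len(a | b) == k}
        counts = Counter()
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1
        current = {c: n for c, n in counts.items() if n >= min_support}
        result.update(current)
        k += 1
    return result


# Hypothetical flow records observed during an alarmed interval.
flows = [
    {"srcIP": "10.0.0.1", "dstIP": "10.0.0.9", "dstPort": 7000, "proto": "TCP"},
    {"srcIP": "10.0.0.2", "dstIP": "10.0.0.9", "dstPort": 7000, "proto": "TCP"},
    {"srcIP": "10.0.0.3", "dstIP": "10.0.0.9", "dstPort": 7000, "proto": "TCP"},
    {"srcIP": "10.0.0.4", "dstIP": "10.0.0.5", "dstPort": 80, "proto": "TCP"},
    {"srcIP": "10.0.0.5", "dstIP": "10.0.0.6", "dstPort": 53, "proto": "UDP"},
]
# Hypothetical detector meta-data: feature values whose histogram bins changed.
metadata = {"dstPort": {7000}}

suspicious = prefilter(flows, metadata)
itemsets = frequent_itemsets(suspicious, min_support=3)
# Report only maximal itemsets: those not contained in a larger frequent itemset.
maximal = [s for s in itemsets if not any(s < t for t in itemsets)]
for s in maximal:
    print(sorted(s), "support =", itemsets[s])
```

Reporting only the maximal itemsets keeps the summary short: one multi-feature pattern stands in for all of its sub-patterns, which is the kind of condensed view an analyst triaging alarms would want.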

Citations

Dissertation•

A study of multivariate behavior and anomaly patterns : tensor decomposition for multiway big data

Alina Rakhi Ajayan

01 Jan 2017

TL;DR: Soft Clustering Data Normalization PreProcessing Stage 1 DGDS Big Data set from the SGSC Project is compared to real-time data sets using the Hadoop 2.0 architecture.

Abstract: ion: Soft Clustering Data Normalization PreProcessing Stage 1 DGDS Big Data set from the SGSC Project

4 citations

Cites background from "Multidimensional Data Mining for An..."

...Experiments have inferred how chunks of data can be given a ‘birds-eye-view’ to obtain large scale trend information, bypassing computations involving every single reading [103, 104]....

Book Chapter•DOI•

Tensor Decompositions in Multimodal Big Data: Studying Multiway Behavioral Patterns

Alina Rakhi Ajayan¹, Firas Al-Doghman¹, Zenon Chaczko¹•Institutions (1)

University of Technology, Sydney¹

24 May 2017

TL;DR: How behavior patterns and related anomalies comprehensively define a CPS is demonstrated to capture the complex knowledge encompassed in these data flows.

Abstract: Present-day cyber-physical systems (CPS) are the confluence of very large data sets, tight time constraints, and heterogeneous hardware units, ridden with latency and volume constraints, demanding newer analytic perspectives. Their system logistics can be well-defined by the data-streams’ behavioral trends across various modalities, without numerical restrictions, favoring resource-saving over methods of investigating individual component features and operations. The aim of this paper is to demonstrate how behavior patterns and related anomalies comprehensively define a CPS. Tensor decompositions are hypothesized as the solution in the context of multimodal smart-grid-originated Big Data analysis. Tensorial data representation is demonstrated to capture the complex knowledge encompassed in these data flows. The uniqueness of this approach is highlighted in the modified multiway anomaly patterns models. In addition, higher-order data preparation schemes, design and implementation of tensorial frameworks, and experimental analysis are the final outcomes.

Proceedings Article•DOI•

Visualizing Multimodal Big Data Anomaly Patterns in Higher-Order Feature Spaces

Alina Rakhi Ajayan¹, Firas Al-Doghman¹, Zenon Chaczko¹•Institutions (1)

University of Technology, Sydney¹

01 Dec 2018

TL;DR: Investigating the applicability of an arithmetic tool, Tensor Decompositions and Factorizations, in this scenario showed that abnormal patterns detected in decomposed tensor factors encompass deep information energy content from Big Data as efficiently as other Pattern Extraction and Knowledge Discovery frameworks, while saving time and resources.

Abstract: The world today, as we know it, is profuse with information about humans and objects. Datasets generated by cyber-physical systems are orders of magnitude larger than their current information processing capabilities. Tapping into these big data flows to uncover much deeper perceptions into the functioning, operational logic and smartness levels attainable has been investigated for quite a while. Knowledge Discovery & Representation capabilities across multiple modalities hold much scope in this direction, with regard to their information holding potential. This paper investigates the applicability of an arithmetic tool, Tensor Decompositions and Factorizations, in this scenario. Higher order datasets are decomposed for Anomaly Pattern capture which encases intelligence along multiple modes of data flow. Preliminary investigations based on data derived from the Smart Grid Smart City Project are compliant with our hypothesis. The results proved that abnormal patterns detected in decomposed Tensor factors encompass deep information energy content from Big Data as efficiently as other Pattern Extraction and Knowledge Discovery frameworks, while salvaging time and resources.

References

Proceedings Article•

Fast Algorithms for Mining Association Rules in Large Databases

Rakesh Agrawal, Ramakrishnan Srikant

12 Sep 1994

10,454 citations

Journal Article•DOI•

Frequent pattern mining: current status and future directions

Jiawei Han¹, Hong Cheng¹, Dong Xin¹, Xifeng Yan¹•Institutions (1)

University of Illinois at Urbana–Champaign¹

01 Aug 2007-Data Mining and Knowledge Discovery

TL;DR: It is believed that frequent pattern mining research has substantially broadened the scope of data analysis and will have deep impact on data mining methodologies and applications in the long run; however, there are still some challenging research issues that need to be solved before frequent pattern mining can claim a cornerstone approach in data mining applications.

Abstract: Frequent pattern mining has been a focused theme in data mining research for over a decade. Abundant literature has been dedicated to this research and tremendous progress has been made, ranging from efficient and scalable algorithms for frequent itemset mining in transaction databases to numerous research frontiers, such as sequential pattern mining, structured pattern mining, correlation mining, associative classification, and frequent pattern-based clustering, as well as their broad applications. In this article, we provide a brief overview of the current status of frequent pattern mining and discuss a few promising research directions. We believe that frequent pattern mining research has substantially broadened the scope of data analysis and will have deep impact on data mining methodologies and applications in the long run. However, there are still some challenging research issues that need to be solved before frequent pattern mining can claim a cornerstone approach in data mining applications.

1,448 citations

Report•DOI•

Data mining approaches for intrusion detection

Wenke Lee¹, Salvatore J. Stolfo¹•Institutions (1)

Columbia University¹

26 Jan 1998

TL;DR: An agent-based architecture for intrusion detection systems where the learning agents continuously compute and provide the updated (detection) models to the detection agents is proposed.

Abstract: In this paper we discuss our research in developing general and systematic methods for intrusion detection. The key ideas are to use data mining techniques to discover consistent and useful patterns of system features that describe program and user behavior, and use the set of relevant system features to compute (inductively learned) classifiers that can recognize anomalies and known intrusions. Using experiments on the sendmail system call data and the network tcpdump data, we demonstrate that we can construct concise and accurate classifiers to detect anomalies. We provide an overview on two general data mining algorithms that we have implemented: the association rules algorithm and the frequent episodes algorithm. These algorithms can be used to compute the intra- and inter-audit record patterns, which are essential in describing program or user behavior. The discovered patterns can guide the audit data gathering process and facilitate feature selection. To meet the challenges of both efficient learning (mining) and real-time detection, we propose an agent-based architecture for intrusion detection systems where the learning agents continuously compute and provide the updated (detection) models to the detection agents.

1,353 citations

Proceedings Article•DOI•

A signal analysis of network traffic anomalies

Paul Barford¹, Jeffery Kline¹, David Plonka¹, Amos Ron¹•Institutions (1)

University of Wisconsin-Madison¹

06 Nov 2002

TL;DR: This paper reports results of signal analysis of four classes of network traffic anomalies: outages, flash crowds, attacks and measurement failures, and shows that wavelet filters are quite effective at exposing the details of both ambient and anomalous traffic.

Abstract: Identifying anomalies rapidly and accurately is critical to the efficient operation of large computer networks. Accurately characterizing important classes of anomalies greatly facilitates their identification; however, the subtleties and complexities of anomalous traffic can easily confound this process. In this paper we report results of signal analysis of four classes of network traffic anomalies: outages, flash crowds, attacks and measurement failures. Data for this study consists of IP flow and SNMP measurements collected over a six month period at the border router of a large university. Our results show that wavelet filters are quite effective at exposing the details of both ambient and anomalous traffic. Specifically, we show that a pseudo-spline filter tuned at specific aggregation levels will expose distinct characteristics of each class of anomaly. We show that an effective way of exposing anomalies is via the detection of a sharp increase in the local variance of the filtered data. We evaluate traffic anomaly signals at different points within a network based on topological distance from the anomaly source or destination. We show that anomalies can be exposed effectively even when aggregated with a large amount of additional traffic. We also compare the difference between the same traffic anomaly signals as seen in SNMP and IP flow data, and show that the more coarse-grained SNMP data can also be used to expose anomalies effectively.
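The detection idea mentioned in this abstract, flagging a sharp increase in the local variance of filtered data, can be illustrated with a small sketch. It does not reproduce the paper's wavelet (pseudo-spline) filtering: a trailing moving average is swapped in as a much cruder detrending step, and the window width, the threshold factor, and the sample series are assumptions chosen only for the example.

```python
from statistics import median, pvariance


def detrend(series, width):
    """Subtract a trailing moving average (a simple stand-in for a wavelet filter)."""
    out = []
    for i, x in enumerate(series):
        window = series[max(0, i - width + 1): i + 1]
        out.append(x - sum(window) / len(window))
    return out


def local_variance(series, width):
    """Population variance of every sliding window of the given width."""
    return [pvariance(series[i:i + width])
            for i in range(len(series) - width + 1)]


def flag_windows(series, width=4, factor=5.0):
    """Return start indices of windows whose variance exceeds factor * median."""
    variances = local_variance(detrend(series, width), width)
    baseline = median(variances)
    return [i for i, v in enumerate(variances) if v > factor * baseline]


# Hypothetical per-interval flow counts with a burst at samples 8 to 11.
traffic = [100, 102, 99, 101, 100, 98, 101, 100,
           180, 20, 190, 15,
           100, 101, 99, 100, 102, 98, 100, 101]

flagged = flag_windows(traffic)
print("windows with a sharp variance increase start at:", flagged)
```

Window `i` covers samples `i` through `i + width - 1`, so a flagged start index slightly before the burst is expected. Using the median as the baseline is a simple design choice; it becomes unreliable when the anomaly occupies a large share of the series.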

919 citations

"Multidimensional Data Mining for An..." refers background in this paper

...Compared to these studies, we learn that intelligently combining multidimensional heavy-hitters with anomaly detection enables us to extract anomalous flows....

Proceedings Article•DOI•

Detecting anomalies in network traffic using maximum entropy estimation

Yu Gu¹, Andrew McCallum¹, Don Towsley¹•Institutions (1)

University of Massachusetts Amherst¹

19 Oct 2005

TL;DR: A behavior-based anomaly detection method is proposed that detects network anomalies by comparing the current network traffic against a baseline distribution; the Maximum Entropy technique provides a flexible and fast approach to estimating that baseline.

Abstract: We develop a behavior-based anomaly detection method that detects network anomalies by comparing the current network traffic against a baseline distribution. The Maximum Entropy technique provides a flexible and fast approach to estimate the baseline distribution, which also gives the network administrator a multi-dimensional view of the network traffic. By computing a measure related to the relative entropy of the network traffic under observation with respect to the baseline distribution, we are able to distinguish anomalies that change the traffic either abruptly or slowly. In addition, our method provides information revealing the type of the anomaly detected. It requires a constant memory and a computation time proportional to the traffic rate.
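The comparison described in this abstract, scoring observed traffic by its relative entropy with respect to a baseline distribution, can be sketched as below. This is a simplified illustration under stated assumptions: the baseline is an add-one-smoothed empirical frequency table rather than a Maximum Entropy estimate, and the packet classes and sample traffic are hypothetical.

```python
import math
from collections import Counter


def distribution(packets, classes, alpha=1.0):
    """Smoothed empirical distribution over packet classes (not Maximum Entropy)."""
    counts = Counter(packets)
    total = len(packets) + alpha * len(classes)
    return {c: (counts[c] + alpha) / total for c in classes}


def relative_entropy(p, q):
    """Kullback-Leibler divergence D(p || q) in nats."""
    return sum(p[c] * math.log(p[c] / q[c]) for c in p if p[c] > 0)


# Hypothetical packet classes (protocol/destination port).
classes = ["tcp/80", "tcp/443", "udp/53", "tcp/7000"]

baseline_packets = ["tcp/80"] * 50 + ["tcp/443"] * 40 + ["udp/53"] * 10
normal_packets = ["tcp/80"] * 26 + ["tcp/443"] * 19 + ["udp/53"] * 5
attack_packets = (["tcp/80"] * 10 + ["tcp/443"] * 8 + ["udp/53"] * 2
                  + ["tcp/7000"] * 30)

baseline = distribution(baseline_packets, classes)
normal_score = relative_entropy(distribution(normal_packets, classes), baseline)
attack_score = relative_entropy(distribution(attack_packets, classes), baseline)

print("normal interval:", round(normal_score, 4))
print("attack interval:", round(attack_score, 4))
```

The per-class terms of the sum also indicate which class contributes most to the divergence, which is one way a detector of this kind can hint at the type of anomaly.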

379 citations