Journal Article

An intelligent PE-malware detection system based on association mining

TLDR
The Intelligent Malware Detection System (IMDS) is an integrated system consisting of three major modules: a PE parser, an OOA rule generator, and a rule-based classifier. An OOA_Fast_FP-Growth algorithm is adapted to efficiently generate OOA rules for classification.
Abstract
The proliferation of malware has presented a serious threat to the security of computer systems. Traditional signature-based anti-virus systems fail to detect polymorphic/metamorphic and new, previously unseen malicious executables. Data mining methods such as Naive Bayes and Decision Tree have been studied on small collections of executables. In this paper, resting on the analysis of Windows APIs called by PE files, we develop the Intelligent Malware Detection System (IMDS) using Objective-Oriented Association (OOA) mining based classification. IMDS is an integrated system consisting of three major modules: PE parser, OOA rule generator, and rule based classifier. An OOA_Fast_FP-Growth algorithm is adapted to efficiently generate OOA rules for classification. A comprehensive experimental study on a large collection of PE files obtained from the anti-virus laboratory of KingSoft Corporation is performed to compare various malware detection approaches. Promising experimental results demonstrate that the accuracy and efficiency of our IMDS system outperform popular anti-virus software such as Norton AntiVirus and McAfee VirusScan, as well as previous data mining based detection systems which employed Naive Bayes, Support Vector Machine (SVM) and Decision Tree techniques. Our system has already been incorporated into the scanning tool of KingSoft’s Anti-Virus software.
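The rule generation and classification steps described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's OOA_Fast_FP-Growth algorithm: it enumerates small API-call sets by brute force, and it assumes the usual definitions of support (fraction of all files that contain the API set and carry the objective label) and confidence (fraction of files containing the API set that carry the objective label). The function names, thresholds, and toy samples are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def mine_ooa_rules(samples, objective, min_support, min_confidence, max_len=2):
    """Brute-force sketch of objective-oriented rule mining (not FP-Growth).

    samples: list of (set of API names, label) pairs.
    Returns (itemset, support, confidence) for rules "itemset -> objective".
    """
    n = len(samples)
    apis = sorted({api for calls, _ in samples for api in calls})
    rules = []
    for k in range(1, max_len + 1):
        for combo in combinations(apis, k):
            items = frozenset(combo)
            # Labels of every sample whose API set contains the candidate itemset.
            covered = [label for calls, label in samples if items <= calls]
            if not covered:
                continue
            hits = sum(1 for label in covered if label == objective)
            support = hits / n
            confidence = hits / len(covered)
            if support >= min_support and confidence >= min_confidence:
                rules.append((items, support, confidence))
    return rules


def classify(calls, rules, objective="malware", default="benign"):
    """Label a file with the objective if any mined rule's itemset matches it."""
    matched = any(items <= calls for items, _, _ in rules)
    return objective if matched else default


# Toy, hand-made API-call sets (hypothetical; not data from the paper).
samples = [
    ({"CreateRemoteThread", "WriteProcessMemory", "OpenProcess"}, "malware"),
    ({"WriteProcessMemory", "OpenProcess", "RegSetValueExA"}, "malware"),
    ({"CreateFileA", "ReadFile"}, "benign"),
    ({"OpenProcess", "ReadFile"}, "benign"),
]

rules = mine_ooa_rules(samples, "malware", min_support=0.5, min_confidence=1.0)
for items, support, confidence in rules:
    print(sorted(items), support, confidence)
```

In a real pipeline the API-call sets would come from the PE parser, and an FP-tree based miner would replace the exhaustive enumeration, which grows combinatorially with the number of distinct APIs.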


Citations
Patent

Automated behavioral and static analysis using an instrumented sandbox and machine learning classification for mobile security

TL;DR: This patent describes a system that allows mobile subscribers, and others, to submit mobile applications to be analyzed for anomalous and malicious behavior. The analysis uses data acquired while the application executes in a highly instrumented and controlled environment, and relies on per-execution data as well as comparative aggregate data across many such executions from one or more subscribers.
Journal Article

A Survey on Malware Detection Using Data Mining Techniques

TL;DR: There is an urgent need to develop intelligent methods for effective and efficient malware detection from the real and large daily sample collection and a comprehensive investigation on both the feature extraction and the classification/clustering techniques is provided.
Journal Article

A comparison of static, dynamic, and hybrid analysis for malware detection

TL;DR: This research trains Hidden Markov Models (HMMs) on both static and dynamic feature sets and compares the resulting detection rates over a substantial number of malware families, finding a fully dynamic approach generally yields the best detection rates.
Journal Article

The rise of machine learning for detection and classification of malware: Research developments, trends and challenges

TL;DR: This survey aims to provide a systematic and detailed overview of machine learning techniques for malware detection, with special emphasis on deep learning approaches.
Proceedings Article

Novel Feature Extraction, Selection and Fusion for Effective Malware Family Classification

TL;DR: This paradigm is presented and discussed in the paper, with emphasis on the phases related to the extraction and selection of a set of novel features for the effective representation of malware samples.
References
Book

The Nature of Statistical Learning Theory

TL;DR: Setting of the learning problem; consistency of learning processes; bounds on the rate of convergence of learning processes; controlling the generalization ability of learning processes; constructing learning algorithms; what is important in learning theory?
Book

Data Mining: Concepts and Techniques

TL;DR: This book presents dozens of algorithms and implementation examples, all in pseudo-code and suitable for use in real-world, large-scale data mining projects, and provides a comprehensive, practical look at the concepts and techniques you need to get the most out of real business data.
Proceedings Article

Mining association rules between sets of items in large databases

TL;DR: An efficient algorithm is presented that generates all significant association rules between items in the database of customer transactions and incorporates buffer management and novel estimation and pruning techniques.
Journal Article

Feature selection based on mutual information criteria of max-dependency, max-relevance, and min-redundancy

TL;DR: This article studies how to select good features according to the maximal statistical dependency criterion based on mutual information. Because the maximal dependency condition is difficult to implement directly, the authors derive an equivalent form, the minimal-redundancy-maximal-relevance (mRMR) criterion, for first-order incremental feature selection.