Author

Fang Ying

Bio: Fang Ying is an academic researcher from National University of Defense Technology. The author has contributed to research in the topics Malware and Feature selection. The author has an h-index of 4 and has co-authored 4 publications receiving 40 citations.

Papers
Journal ArticleDOI
Bo Yu, Fang Ying, Qiang Yang, Yong Tang, Liu Liu
TL;DR: This paper conducts a survey on malware behavior description and analysis considering three aspects: malware behavior description, behavior analysis methods, and visualization techniques.
Abstract: Behavior-based malware analysis is an important technique for automatically analyzing and detecting malware, and it has received considerable attention from both academic and industrial communities. By considering how malware behaves, we can tackle the malware obfuscation problem, which cannot be processed by traditional static analysis approaches, and we can also derive the as-built behavior specifications and cover the entire behavior space of the malware samples. Although there have been several works focusing on malware behavior analysis, such research is far from mature, and no overviews have been put forward to date to investigate current developments and challenges. In this paper, we conduct a survey on malware behavior description and analysis considering three aspects: malware behavior description, behavior analysis methods, and visualization techniques. First, existing behavior data types and emerging techniques for malware behavior description are explored, especially the goals, principles, characteristics, and classifications of behavior analysis techniques proposed in the existing approaches. Second, the inadequacies and challenges in malware behavior analysis are summarized from different perspectives. Finally, several possible directions are discussed for future research.

34 citations

Book ChapterDOI
03 Jul 2017
TL;DR: The experimental results demonstrate that the ensemble learning based dynamic malware classification approach can classify malware variants with a high F1-score while requiring little classification time on datasets of different scales.
Abstract: Dynamic analysis plays an important role in analyzing malware variants that use obfuscation, polymorphism, and metamorphism techniques. Malware classification is an emerging approach for discriminating between different malware families. However, existing malware classification methods perform poorly on small-scale datasets, and some machine learning algorithms have difficulty handling imbalanced datasets. To solve these issues, we propose an ensemble learning based dynamic malware classification approach aimed at datasets of different scales. Additionally, a novel feature selection method is presented to select features with strong discrimination power. In particular, we continue to explore issues in feature representation and feature selection. To verify the efficiency of our approach, we perform a series of comparative experiments with existing feature selection methods, commercial anti-malware tools, and current malware classification techniques. The experimental results demonstrate that our approach can classify malware variants with a high F1-score while requiring little classification time on datasets of different scales.

13 citations

Patent
25 Aug 2017
TL;DR: This patent discloses an image matching-based malicious code detection method comprising the steps of S1, obtaining training samples corresponding to malicious codes of different family categories, converting the training samples into grayscale images, extracting the corresponding image texture features, and forming a reference sample set for each family category; S2, converting to-be-detected malicious codes into grayscale images and extracting the corresponding image texture features; and S3, matching the image texture features extracted in step S2 with the reference sample set corresponding to each family category to confirm the family categories of the to-be-detected malicious codes.
Abstract: The invention discloses an image matching-based malicious code detection method. The method comprises the steps of S1, obtaining training samples corresponding to malicious codes of different family categories, converting the training samples into grayscale images and extracting corresponding image texture features; selecting a first reference sample from the training samples of each family category, selecting a second reference sample according to the first reference sample and the difference of the image texture features among the samples, and forming a corresponding reference sample set by the first reference sample and the second reference sample selected from each family category; S2, converting to-be-detected malicious codes into grayscale images and extracting corresponding image texture features; and S3, matching the image texture features extracted in the step S2 with the reference sample set corresponding to each family category, and confirming the family categories of the to-be-detected malicious codes according to matching results. The method has the advantages of simple realization, strong robustness, high detection accuracy and good detection effect.
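
The S1 to S3 pipeline can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not say which texture features or matching rule are used, so a normalized intensity histogram and an L1 nearest-reference match stand in for them here. The function names, the image width of 16 and the bin count of 8 are all assumptions made for illustration.

```python
def bytes_to_grayscale(data: bytes, width: int = 16) -> list[list[int]]:
    """Treat each byte as one 0-255 pixel and wrap the bytes into rows of `width`."""
    padded = data + bytes(-len(data) % width)  # zero-pad the final row
    return [list(padded[i:i + width]) for i in range(0, len(padded), width)]


def histogram_feature(image: list[list[int]], bins: int = 8) -> list[float]:
    """Stand-in 'texture' feature (assumption): normalized intensity histogram."""
    counts = [0] * bins
    total = 0
    for row in image:
        for pixel in row:
            counts[pixel * bins // 256] += 1
            total += 1
    if total == 0:  # empty input: return an all-zero feature
        return [0.0] * bins
    return [c / total for c in counts]


def l1_distance(a: list[float], b: list[float]) -> float:
    return sum(abs(x - y) for x, y in zip(a, b))


def classify(sample: bytes, references: dict[str, list[list[float]]]) -> str:
    """Return the family whose reference set holds the closest feature vector."""
    feature = histogram_feature(bytes_to_grayscale(sample))
    return min(
        references,
        key=lambda family: min(l1_distance(feature, ref) for ref in references[family]),
    )
```

Here `references` maps each family name to the feature vectors of its selected reference samples, so a call such as `classify(sample_bytes, references)` returns the best-matching family label. Choosing the first and second reference samples per family, as described in S1, is left out of the sketch.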

8 citations

Patent
24 May 2017
TL;DR: A multi-dimension behavior characteristic-based malicious code classification method is proposed, which comprises the steps of S1, obtaining behavior data of a malicious code; S2, calculating the time difference of two adjacent system function calls according to a function call sequence, and constructing a time difference information table of the system function calls; S3, extracting frequency information of the system function calls; S4, extracting behavior classification frequency information; S5, performing weighted calculation and normalized processing on the time difference information table, the frequency information table of the system function calls, and the frequency information table of behavior types, and combining the characteristics into a new characteristic space; and S6, performing cross verification on the behavior characteristics of all family samples with a typical machine learning classification method.
Abstract: The invention discloses a multi-dimension behavior characteristic-based malicious code classification method. The method comprises the steps of S1, obtaining behavior data of a malicious code; S2, calculating a time difference of two adjacent system function calls according to a function call sequence, and constructing a time difference information table of the system function calls; S3, extracting frequency information of the system function calls; extracting names of the system function calls from the behavior data, performing statistics on the frequency of each system function call, and establishing a frequency information table of the system function calls; S4, extracting behavior classification frequency information; S5, performing weighted calculation and normalized processing on the time difference information table, the frequency information table of the system function calls, and a frequency information table of behavior types, processing characteristics of the time difference information table, and after the processing, combining the characteristics into a new characteristic space; and S6, performing cross verification on behavior characteristics of all family samples by adopting a typical classification method of machine learning. The method has the advantages of simple principle, easy realization, good effect and the like.
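
Steps S2 to S5 can be sketched in Python as follows. This is an illustration under assumptions, not the patented method: the abstract does not specify how the time difference table is keyed or how the weights are chosen, so here each gap is attributed to the earlier of the two adjacent calls and all weights default to 1.0. The function names and the `vocabulary`, `categories` and `weights` parameters are hypothetical.

```python
from collections import Counter


def normalize(values: list[float]) -> list[float]:
    """Scale values to sum to 1; an all-zero block stays all zero."""
    total = sum(values)
    return [v / total if total else 0.0 for v in values]


def build_features(calls, vocabulary, categories, weights=(1.0, 1.0, 1.0)):
    """calls: list of (timestamp, api_name) pairs in call order.

    Returns one vector made of three weighted, normalized blocks:
    time gaps per API, call frequency per API, frequency per behavior type.
    """
    # S2: time difference between adjacent calls, credited to the earlier call
    gap_by_api = Counter()
    for (t0, name), (t1, _) in zip(calls, calls[1:]):
        gap_by_api[name] += t1 - t0

    # S3: how often each system function is called
    freq = Counter(name for _, name in calls)

    # S4: how often each behavior type occurs (API -> type mapping is assumed)
    type_names = sorted(set(categories.values()))
    type_freq = Counter(categories[name] for _, name in calls if name in categories)

    # S5: normalize each block, weight it, and concatenate into one feature space
    blocks = [
        [gap_by_api[api] for api in vocabulary],
        [freq[api] for api in vocabulary],
        [type_freq[t] for t in type_names],
    ]
    return [w * v for w, block in zip(weights, blocks) for v in normalize(block)]
```

The resulting vectors could then be passed to any standard classifier and evaluated with cross-validation, which corresponds to step S6.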

4 citations


Cited by
Journal ArticleDOI
TL;DR: This paper finds that an ensemble of recurrent neural networks is able to predict whether an executable is malicious or benign within the first 5 seconds of execution with 94% accuracy, which enables cyber security endpoint protection to be advanced to use behavioural data for blocking malicious payloads rather than detecting them post-execution and having to repair the damage.

205 citations

Journal ArticleDOI
TL;DR: A dynamic analysis for IoT malware detection (DAIMD) is proposed to reduce damage to IoT devices by detecting both well-known IoT malware and new, intelligently evolved variants of IoT malware.
Abstract: Internet of Things (IoT) technology provides the basic infrastructure for a hyper-connected society where all things are connected and exchange information through the Internet. IoT technology is fused with 5G and artificial intelligence (AI) technologies for use in various fields such as the smart city and smart factory. As the demand for IoT technology increases, security threats against IoT infrastructure, applications, and devices have also increased. A variety of studies have been conducted on the detection of IoT malware to avoid the threats posed by malicious code. While existing models may accurately detect malicious IoT code identified through static analysis, detecting the new and variant IoT malware that is quickly being generated may become challenging. This paper proposes a dynamic analysis for IoT malware detection (DAIMD) to reduce damage to IoT devices by detecting both well-known IoT malware and new, intelligently evolved variants of IoT malware. The DAIMD scheme learns IoT malware using a convolutional neural network (CNN) model and analyzes IoT malware dynamically in a nested cloud environment. DAIMD performs dynamic analysis on IoT malware in a nested cloud environment to extract behaviors related to memory, network, virtual file system, process, and system call. The extracted and analyzed behavior data are converted into images, and these behavior images of IoT malware are used to train the CNN and are classified by it. DAIMD can minimize the infection damage of IoT devices from malware by visualizing and learning the vast amount of behavior data generated through dynamic analysis.

81 citations

Journal ArticleDOI
TL;DR: A detailed meta-review of the existing surveys related to malware and its detection techniques, showing an arms race between these two sides of a barricade, is presented in this article.
Abstract: Cyber attacks are currently blooming, as the attackers reap significant profits from them and face a limited risk when compared to committing the “classical” crimes. One of the major components that leads to the successful compromising of the targeted system is malicious software. It allows using the victim’s machine for various nefarious purposes, e.g., making it a part of the botnet, mining cryptocurrencies, or holding hostage the data stored there. At present, the complexity, proliferation, and variety of malware pose a real challenge for the existing countermeasures and require their constant improvements. That is why, in this paper we first perform a detailed meta-review of the existing surveys related to malware and its detection techniques, showing an arms race between these two sides of a barricade. On this basis, we review the evolution of modern threats in the communication networks, with a particular focus on the techniques employing information hiding. Next, we present the bird’s eye view portraying the main development trends in detection methods with a special emphasis on the machine learning techniques. The survey is concluded with the description of potential future research directions in the field of malware detection.

63 citations

Journal ArticleDOI
TL;DR: Two novel techniques, incremental bagging (iBagging) and enhanced semi-random subspace selection (ESRS), are proposed and incorporated into an ensemble-based detection model, which achieves higher detection accuracy than existing solutions.

52 citations

Journal ArticleDOI
01 Oct 2018
TL;DR: This paper presents a semantic aware dynamic malware detection tool, SWORD, which encapsulates the semantics of Android apps in such a way that makes it resilient towards injection-based evasion techniques.
Abstract: Malicious Android applications have become a more advanced and severe threat to user privacy, confidentiality, integrity, money, and devices. The process of malware evolution mainly consists of modifications to existing malware through repackaging of apps, polymorphism, metamorphism, and injection of malicious code. The existing dynamic approaches can handle polymorphism, metamorphism, and repackaging of apps but fail to address code injection at runtime, as it modifies the control/data flow. In this paper, we present a semantic aware dynamic malware detection tool, SWORD. It encapsulates the semantics of Android apps in such a way that makes it resilient towards injection-based evasion techniques. The intuition behind specifying the semantics of apps lies in applying the Asymptotic Equipartition Property (AEP) inherited from the information theory domain. The semantics of the app are captured using a sequence of system calls. To assess the efficacy of SWORD, we carried out comprehensive experiments on 6000 execution traces of 2000 applications (1000 malware apps belonging to 119 different families and 1000 benign apps, selected randomly from 12,000 Google Play store apps). We obtain a detection accuracy of 94.2%. Moreover, we show that SWORD can cope with code injection based evasion techniques.
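
The AEP itself can be illustrated with a short Python sketch. For a long sequence drawn from an i.i.d. source, the per-symbol surprisal, -(1/n) log2 p(x1..xn), is close to the source entropy H, and sequences for which this holds form the "typical set". The sketch below checks whether a system-call trace is typical with respect to a distribution estimated from reference traces. It is not SWORD's algorithm, which the abstract does not describe: the i.i.d. model, the `floor` probability for unseen calls, the `epsilon` tolerance and all function names are assumptions.

```python
import math
from collections import Counter


def symbol_distribution(traces: list[list[str]]) -> dict[str, float]:
    """Estimate system-call probabilities from reference traces (i.i.d. assumption)."""
    counts = Counter(call for trace in traces for call in trace)
    total = sum(counts.values())
    return {call: n / total for call, n in counts.items()}


def entropy(dist: dict[str, float]) -> float:
    """Shannon entropy H of the distribution, in bits per symbol."""
    return -sum(p * math.log2(p) for p in dist.values())


def per_symbol_surprisal(trace: list[str], dist: dict[str, float],
                         floor: float = 1e-6) -> float:
    """-(1/n) * log2 p(trace); unseen calls get the small `floor` probability."""
    if not trace:
        raise ValueError("trace must contain at least one system call")
    return -sum(math.log2(dist.get(call, floor)) for call in trace) / len(trace)


def is_typical(trace: list[str], dist: dict[str, float], epsilon: float = 0.5) -> bool:
    """True if the trace lies within `epsilon` bits of the entropy (typical set)."""
    return abs(per_symbol_surprisal(trace, dist) - entropy(dist)) <= epsilon
```

A trace dominated by calls that the reference traces never contain has a much higher surprisal than H and therefore falls outside the typical set, which is the kind of deviation such a check would flag.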

25 citations