How many samples were used for the evaluation of their model?

For the evaluation of their model, 20,000 malware samples from VirusShare [38] and 1,260 from the Malgenome project [37] were used.

Why is it necessary to update the model continuously?

Since the malware detection model should reflect the characteristics of those new applications for accurate and prompt detection, it is necessary to update the model continuously.

What is the reason for the resizing algorithms?

The size of the raw data such as naïve binary files of each application varies greatly, so the resizing algorithms are necessary to provide the fixed sized feature vectors which fit in their neural network model.

What are the methods to verify Android applications to defend against the component hijacking attacks?

CHEX [13], DroidChecker [14], AAPL [15], and Amandroid [16] are methods to verify Android applications to defend against the component hijacking attacks.

What are the main reasons of their feature vector generation method?

To show the effectiveness of their feature vector generation method including feature extraction, the authors conducted experiments to compare their framework with other methods: the native binary-based detection method, the bag-of-words based detection method, and an open-sourced opcode sequence-based detection method [30].

(Open Access) A Multimodal Deep Learning Method for Android Malware Detection Using Various Features (2019) | TaeGuen Kim

A Multimodal Deep Learning Method for Android Malware Detection

using Various Features

Kim, T., Kang, B., Rho, M., Sezer, S., & Im, E. G. (2018). A Multimodal Deep Learning Method for Android Malware Detection using Various Features. IEEE Transactions on Information Forensics and Security, (3), 773-788. https://doi.org/10.1109/TIFS.2018.2866319

Published in:

IEEE Transactions on Information Forensics and Security

Document Version:

Peer reviewed version


T-IFS-07942-2017



Abstract— With the widespread use of smartphones, the

number of malware has been increasing exponentially. Among

smart devices, Android devices are the most targeted devices by

malware because of their high popularity. This paper proposes a

novel framework for Android malware detection. Our framework

uses various kinds of features to reflect the properties of Android

applications from various aspects, and the features are refined

using our existence-based or similarity-based feature extraction

method for effective feature representation on malware detection.

Besides, a multimodal deep learning method is proposed to be used

as a malware detection model. This paper is the first study to

apply multimodal deep learning to Android malware

detection. With our detection model, it was possible to maximize

the benefits of encompassing multiple feature types. To evaluate

the performance, we carried out various experiments with a total

of 41,260 samples. We compared the accuracy of our model with

that of other deep neural network models. Furthermore, we

evaluated our framework in various aspects including the

efficiency in model updates, the usefulness of diverse features, and

our feature representation method. In addition, we compared the

performance of our framework with those of other existing

methods including deep learning based methods.

Index Terms—Android malware, malware detection, intrusion

detection, machine learning, neural network.

I. INTRODUCTION

With the growing popularity of mobile devices such as

smartphones or tablets, attacks on the mobile devices

have been increasing. Mobile malware is one of the most

dangerous threats which cause various security incidents as

well as financial damages. According to the G DATA report [1]

in 2017, security experts discovered about 750,000 new

Android malware samples during the first quarter of 2017. It is expected

that a large amount of mobile malware will continue to be developed and

spread to commit various cybercrimes on mobile devices.

This paper was first submitted on Oct. 18, 2017. This research was

supported by the MSIT (Ministry of Science, ICT), Korea, under the

ITRC (Information Technology Research Center) support program (IITP-2018-2013-1-00881)

supervised by the IITP (Institute for Information &

communication Technology Promotion). This work was supported by Institute

for Information & communications Technology Promotion (IITP) grant funded

by the Korea government (MSIT) (No.2017-0-00388, Development of Defense

Technologies against Ransomware). This work was supported by the National

Research Foundation of Korea (NRF) grant funded by the Korea

government (MSIP) (No. NRF-2016R1A2B4015254).

TaeGuen Kim is with the Department of Computer and Software, Hanyang

University, Seoul, 04763 Korea (e-mail: cloudio17@hanyang.ac.kr).

Android is a mobile operating system that is most targeted by

mobile malware because of the popularity of Android devices.

In addition to the number of Android devices, there is another

reason that leads malware authors to develop Android malware.

The reason is that the Android operating system allows users to

install applications downloaded from third-party markets, and

attackers can lure or mislead Android users to download

malicious or suspicious applications from attackers’ servers.

To mitigate the attacks by Android malware, various research

approaches have been proposed so far. The malware detection

approaches can be classified into two categories: static analysis

based detection [2-19] and dynamic analysis based detection

[20-24]. The static analysis based methods use syntactic

features that can be extracted without executing an application,

whereas the dynamic analysis based methods use semantic

features that can be monitored when an application is executed

in a controlled environment. Static analysis has an advantage

that it is unnecessary to set the execution environments, and the

computational overheads for static analysis are relatively low.

Dynamic analysis has an advantage that it is possible to handle

malicious applications which use some obfuscation techniques

such as code encryption or packing.

In this paper, we assume that obfuscated malware is

processed by dynamic analysis based methods, and we focus on

the development of a static analysis based method to distinguish

between malware and benign applications. This paper proposes

a novel malware detection framework based on various static

features. Our framework is flexible enough to accommodate new feature types,

so it is possible to utilize dynamic features in the future.

TaeGuen Kim, BooJoong Kang, Mina Rho, Sakir Sezer and Eul Gyu Im

Boojoong Kang is with the Centre for Secure Information Technologies

(CSIT), Queen’s University of Belfast, Belfast, UK (e-mail:

B.Kang@qub.ac.uk).

Mina Rho is with the Department of Computer Science and Engineering,

Hanyang University, Seoul, 04763 Korea (e-mail: minarho@hanyang.ac.kr).

Sakir Sezer is with the Centre for Secure Information Technologies (CSIT),

Queen’s University of Belfast, Belfast, UK (e-mail: s.sezer@qub.ac.uk).

Eul Gyu Im is with the Department of Computer Science and Engineering,

Hanyang University, Seoul, 04763 Korea (e-mail: imeg@hanyang.ac.kr).

There are many previous works that are related to Android

malware detection, but most of the previous studies use only

limited types of features to detect malware. Each type of feature

can represent only a few properties of applications. In

contrast, we propose a framework to detect malware using many

kinds of feature information to reflect various characteristics of

applications in various aspects. Our proposed framework first

extracts and processes multiple feature types, and refines them

using our feature vector generation methods. Our feature vector

generation method consists of an existence-based method and a

similarity-based method, and these are very effective at

distinguishing between malware and benign applications even

though malware shares many properties with benign

applications. In addition, our framework uses a classification

model that weights each feature's influence on the classification according to its

importance. Among many useful classification algorithms, we

concluded that deep learning is a suitable

classification algorithm for our framework, which uses various

types of features.

We propose a multimodal deep neural network model to fit

the features with different properties. Multimodal deep

learning is generally used to make a neural

network reflect the properties of different kinds of features.

For example, multimodal deep learning was used to

recognize human speech using both voice information and

mouth shape information [48]. The different feature types

are fed into and processed by different initial neural networks

separately, and each initial network is connected to a final

neural network to produce the classification results. According

to our survey, our research is the first application of the

multimodal deep learning to the Android malware detection.
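The architecture described above (one initial network per feature type, all connected to a final network) can be illustrated with a minimal forward-pass sketch. This is not the authors' implementation: the layer sizes, the random initialization, the ReLU hidden activations, the sigmoid output, and the names build_model and multimodal_forward are all assumptions made for illustration, and no training is shown.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Create small random weights and zero biases for one dense layer."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense_relu(inputs, layer):
    """Fully connected layer followed by a ReLU activation."""
    weights, biases = layer
    return [max(0.0, sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def build_model(input_sizes, hidden_size, seed=0):
    """Hypothetical model: one initial network per feature type plus a final network."""
    rng = random.Random(seed)
    initial = [init_layer(n, hidden_size, rng) for n in input_sizes]
    final = init_layer(hidden_size * len(input_sizes), hidden_size, rng)
    output = init_layer(hidden_size, 1, rng)
    return initial, final, output


def multimodal_forward(feature_vectors, model):
    """Return an (untrained) malware score in (0, 1) for one application."""
    initial, final, output = model
    merged = []
    # Each feature type is processed separately by its own initial network.
    for vector, layer in zip(feature_vectors, initial):
        merged.extend(dense_relu(vector, layer))
    # The outputs of the initial networks are concatenated and fed to the final network.
    hidden = dense_relu(merged, final)
    weights, bias = output[0][0], output[1][0]
    logit = sum(w * h for w, h in zip(weights, hidden)) + bias
    return 1.0 / (1.0 + math.exp(-logit))  # assumed sigmoid output
```

For example, build_model([4, 3], hidden_size=2) creates a model for two feature types, and multimodal_forward([[1, 0, 1, 0], [0, 1, 1]], model) returns its score for one input.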

We conducted many experiments using our framework with

a large dataset from VirusShare [38] and the well-known small

dataset from the Malgenome project [37]. We measured and

compared the performance of our model with that of the deep

neural network model. In addition, we evaluated our framework

in various aspects including efficiency in model updates, the

usefulness of diverse features and effects of our feature

representation method. According to the comparison results

with other deep learning based methods, we argue that our

framework has good performance on the malware detection.

Our contributions can be summarized as follows:

• We proposed a novel Android malware detection

framework using diverse features that can reflect the

characteristics of Android applications.

• We suggested feature vector generation methods that can

represent malware characteristics effectively even when

malware shares many common properties with benign

applications.

• We introduced how the multimodal neural network can be

applied in a malware detection system. Model learning

strategies and an online update method for malware

detection are proposed. To the best of our knowledge, this

research is the first application of

multimodal deep learning to Android malware detection.

• We provided various experimental results of our

framework to evaluate the performance in various aspects.

A total of seven experiments were conducted in this paper.

The rest of the paper is organized as follows: Section II

explains the overall architecture of our Android malware

detection framework and describes how the framework works

in detail, Section III presents the feature types that are used in

our framework, and the multimodal neural network algorithm

is explained in Section IV. Section V shows the experimental

results to show the performance of our framework, and Section

VI discusses related work, followed by Section VII that

summarizes our research and provides future work of this

ongoing research.

Fig. 1. The overall architecture of the proposed framework


II. PROPOSED FRAMEWORK

Fig. 1 shows the overall architecture of our framework. The

framework uses seven kinds of features: string feature,

method opcode feature, method API feature, shared library

function opcode feature, permission feature, component feature,

and environmental feature. Using those features, the seven

corresponding feature vectors are generated first, and then,

among them, the permission/component/environmental

feature vectors are merged into one feature vector. Finally, the

resulting five feature vectors are fed to the classification model for

malware detection. The framework conducts four major

processes for the detection: raw data extraction process, feature

extraction process, feature vector generation process, and

detection process. These processes are explained in the next

subsections.

A. Raw Data Extraction Process

The raw data extraction process is performed to make

Android APK (Android Package Kit) files interpretable. To

extract the raw data, an APK file is unzipped, and a manifest

file, a dex file, and shared library files are extracted first. The

manifest file and the dex file are decoded or disassembled by

APKtool [32], and the shared library files (i.e. .so files) in the

package can be disassembled by IDA Pro [33].
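Because an APK is a ZIP archive, the first step of this process can be sketched with Python's standard zipfile module. The function name list_raw_data and the returned dictionary layout are hypothetical. The sketch only locates the relevant entries; decoding the binary manifest and disassembling the dex and .so files still require tools such as APKtool and IDA Pro, which are not invoked here.

```python
import io
import zipfile


def list_raw_data(apk_bytes):
    """Group the entries of an APK (a ZIP archive) by the raw data they hold."""
    with zipfile.ZipFile(io.BytesIO(apk_bytes)) as apk:
        names = apk.namelist()
    return {
        # The manifest inside an APK is stored in a binary XML format.
        "manifest": [n for n in names if n == "AndroidManifest.xml"],
        # An APK may contain one or more dex files (classes.dex, classes2.dex, ...).
        "dex": [n for n in names if n.endswith(".dex")],
        # Native shared libraries are stored under lib/<abi>/.
        "shared_libraries": [n for n in names
                             if n.startswith("lib/") and n.endswith(".so")],
    }
```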

B. Feature Extraction Process

The feature extraction process is conducted to obtain the

essential feature data from the raw data. The detailed definition

of feature types is explained in Section III.

First, method opcode features and method API features are

extracted from smali files, which are the disassembled output

of a dex file. Each smali file is split into method

blocks, and, by scanning Dalvik bytecodes, the Dalvik opcode

frequency of each method is calculated. In addition, during the

bytecode scanning, it is checked whether the invocation of the

dangerous APIs exists in the method, and the dangerous API

invocation frequency of each method is calculated. In case of

string features, strings are simply collected from the whole

smali files without considering the method separation.
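The per-method scan described above can be sketched as follows. This is a simplified illustration, not the authors' extractor: it assumes well-formed smali text, treats the first token of each instruction line as the opcode, and uses a one-entry DANGEROUS_APIS set as a hypothetical stand-in for the manually selected API list.

```python
from collections import Counter

# Hypothetical placeholder for the selected dangerous API list.
DANGEROUS_APIS = {"Landroid/telephony/SmsManager;->sendTextMessage"}


def method_features(smali_text):
    """Return per-method Dalvik opcode and dangerous API invocation frequencies."""
    features = {}
    current = None
    for raw in smali_text.splitlines():
        line = raw.strip()
        if line.startswith(".method"):
            current = line.split()[-1]  # method name with its signature
            features[current] = {"opcodes": Counter(), "apis": Counter()}
        elif line.startswith(".end method"):
            current = None
        elif current and line and not line.startswith((".", "#", ":")):
            # Skip directives, comments, and labels; count everything else.
            opcode = line.split()[0]
            features[current]["opcodes"][opcode] += 1
            if opcode.startswith("invoke-"):
                # The call target is the last operand, before the parameter list.
                target = line.rsplit(", ", 1)[-1].split("(")[0]
                if target in DANGEROUS_APIS:
                    features[current]["apis"][target] += 1
    return features
```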

Shared library function opcode features are extracted from

the instruction sequences of the disassembled code of .so files.

The instruction sequence of each function is scanned to extract

the information of the assembly opcode frequency.

The permission features, the component features, and

environmental features are extracted from the manifest XML

file. While visiting the XML tree nodes, each node’s tag is

checked to confirm whether the node contains the information

about permissions, application components, and so on.

C. Feature Vector Generation Process

The features extracted in the previous process are used to

compose feature vectors. Seven kinds of feature vectors are

generated from the extracted features. The seven feature vectors are

divided into two types according to their feature representations:

existence-based feature vectors and similarity-based feature

vectors. An existence-based feature vector is a feature vector

whose elements only indicate whether each feature in the

malicious feature database exists in the application; examples are the string,

permission, component and environmental feature vectors. On

the other hand, a similarity-based feature vector is a feature

vector whose elements represent the similarity to the malware

representatives in the malicious feature database; the method

opcode, method API and shared library function opcode feature vectors

are similarity-based feature vectors.

The malicious feature database herein is a repository that

contains features and malware representatives of known

malicious applications. The structure of the database is

described in Fig. 5 in APPENDIX B, and each feature is

explained in Section III. In addition, the malware

representatives mean the centroids of the clusters which are

calculated using the K-means clustering algorithm [44].

Algorithms I and II in APPENDIX A explain the processing

flows of the feature generation. First, as explained in Algorithm

I, the existence-based feature generation process is simple. The

feature values in the malicious feature database correspond to

the elements of the feature vector, and every feature value is

searched in the features extracted from input applications. If

there is no certain feature value in the extracted features, its

absence is represented as zero. Otherwise, the existence of the

feature value is represented as one in the vector.
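The existence-based generation just described amounts to a membership test per database entry. A minimal sketch, assuming the database features are available as an ordered list (the function and parameter names are illustrative, and Algorithm I in the appendix remains the authoritative description):

```python
def existence_vector(database_features, extracted_features):
    """Mark each database feature as 1 if the application has it, else 0."""
    extracted = set(extracted_features)  # set lookup keeps each check O(1)
    return [1 if feature in extracted else 0 for feature in database_features]
```

The vector length always equals the number of features in the database, so every application yields an input of the same size regardless of how many features it contains.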

Second, the similarity-based feature vectors are generated as

explained in Algorithm II. The method opcode feature, the

method API feature, and the shared library function opcode

feature used in this feature vector generation process are in the

form of a list of frequencies. The frequency values can vary

considerably, so the features of an input application are first

normalized to fit the feature values in the range of [0, 1]. The

min-max scaling method is used in the normalization [45]. Then,

each malware representative (the centroid of the cluster) in the

malicious feature database is compared with the features of the

input application using the Euclidean distance measure. For

each malware representative, the minimum of its distances to the

input application's features is converted to a similarity, and the

calculated similarity is recorded in the corresponding element

of the feature vector. By recording the highest similarity value

for each of the multiple malware representatives, the feature vector

contains similarities to multiple cluster centroids, which are

computed from known malware applications. Therefore, the

similarity-based feature vector can indicate

whether the input application's features belong to the malware clusters.
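The similarity-based steps above (min-max scaling, Euclidean distance to each centroid, minimum distance converted to a similarity) can be sketched as below. Two points are assumptions rather than facts from the text: the scaling bounds are taken per dimension over the input application's own rows, and the distance is converted to a similarity with 1 / (1 + d), since the conversion formula is not given here. The input must contain at least one row.

```python
import math


def min_max_scale(rows):
    """Scale each column of a list of frequency rows into the range [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(column) for column in columns]
    highs = [max(column) for column in columns]
    return [[(v - lo) / (hi - lo) if hi > lo else 0.0
             for v, lo, hi in zip(row, lows, highs)]
            for row in rows]


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def similarity_vector(frequency_rows, centroids):
    """One element per malware representative: similarity of the closest row."""
    scaled = min_max_scale(frequency_rows)
    vector = []
    for centroid in centroids:
        # The minimum distance corresponds to the highest similarity.
        distance = min(euclidean(row, centroid) for row in scaled)
        vector.append(1.0 / (1.0 + distance))  # assumed conversion
    return vector
```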

To improve the performance of our framework, we refined

the feature vector with a predefined threshold value. The

similarity values that exceed the predefined similarity threshold

are set to one, and all other values are set to zero. This refinement

removes the features that are not close enough to any

malware representative but still have small nonzero similarity values, and it

also simplifies the computation in the deep learning process.
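The threshold refinement can be expressed in a single step. In this sketch the function name is illustrative and the threshold value is left to the caller, because the predefined value is not stated in this section.

```python
def binarize(similarities, threshold):
    """Set values that exceed the threshold to 1 and all others to 0."""
    return [1 if value > threshold else 0 for value in similarities]
```

Note that "exceed" is read here as a strict comparison, so a value equal to the threshold becomes zero.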

D. Detection Process

After all the seven feature vectors are generated in the

previous process, the detection process is conducted to

determine whether the given application is malicious or not.

Before examining the feature vectors with the detection model,

the permission feature vector, the component feature vector,


and the environmental feature vector are merged into a single

feature vector. Therefore, our model receives five feature

vectors and performs mathematical operations at each layer. Once

all operations are completed, the model produces

the estimated label for the given input application.
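The merge from seven feature vectors to five model inputs can be sketched as follows. The dictionary keys, the order of the five inputs, and the use of plain concatenation for the merge are assumptions made for illustration.

```python
def build_model_inputs(vectors):
    """Merge the three manifest-derived vectors and return the five model inputs."""
    merged = (vectors["permission"]
              + vectors["component"]
              + vectors["environmental"])  # assumed: simple concatenation
    return [
        vectors["string"],
        vectors["method_opcode"],
        vectors["method_api"],
        vectors["shared_library_opcode"],
        merged,
    ]
```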

III. THE DEFINITION OF FEATURES

Diverse features could be helpful to reflect the characteristics

of an application. Even though some features such as

environmental information are not directly related to malicious

activities, these features may contribute to defining the

application characteristics.

Our proposed framework uses the following features:

• String feature

• Method opcode feature

• Method API feature

• Shared library function opcode feature

• Permission feature

• Component feature

• Environmental feature

In our framework, the deep learning algorithm is used to

classify the unknown samples into the malware class or the

benign class. The deep learning algorithm generates a neural

network model that can derive the best classification accuracy

by updating the weight of each neuron input. The degree of

influence of the feature on classification is determined

according to the weight of the neurons affected by the feature.

If there is an insignificant feature in the classification, the

weight of the relevant neurons is reduced. Therefore, each

feature is used according to its contribution.

The next subsections explain each feature type that is used in

our framework. It is noted that the features are converted to the

feature vectors to apply them to the neural network.

A. String Feature

The string feature is extracted from a set of string values in

smali files. The feature extraction module collects all operand

values of the const-string and const-string/jumbo instructions.

These are the Dalvik opcodes that move

a reference to a string into a specific register. The number of

strings in an application spans a wide range. As the number of

applications increases, the number of strings from those

applications grows explosively. Therefore, strings are

hashed, and a modulo operation is applied to the hashed values to keep

the feature space bounded. The hash function used in the framework is

the SHA-512 hash function.
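The hash-then-modulo step can be sketched with Python's hashlib. The modulus (vector_size), the UTF-8 encoding, the big-endian interpretation of the digest, and the use of the result as an index into a fixed-size vector are assumptions; the text states only that SHA-512 and a modulo operation are used.

```python
import hashlib


def hashed_string_index(value, vector_size):
    """Map a string to a bounded index using SHA-512 followed by a modulo."""
    digest = hashlib.sha512(value.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % vector_size


def string_feature_vector(strings, vector_size):
    """Hypothetical fixed-size vector marking which hashed indices occur."""
    vector = [0] * vector_size
    for value in strings:
        vector[hashed_string_index(value, vector_size)] = 1
    return vector
```

With this design, different strings can collide on the same index. A larger modulus reduces collisions at the cost of a longer input vector.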

B. Method opcode and API Feature

Dalvik opcode frequency and API invocation frequency of

methods may imply application behaviors and coding habits of

the developer. For this reason, Dalvik opcode frequency and

API invocation frequency of methods are used to define the

method features. The method opcode frequency can be

calculated by scanning the bytecode in each method. In the case

of the API invocation frequency, the bytecodes for API

invocation are checked to count the API invocations in each

method. To capture malicious behaviors, invocations of only

selected APIs are counted. The APIs that might be used in

malicious activities are investigated manually using the

Android Developer reference pages [50]. Additionally, the

APIs that were introduced in [35] are also added to the selected

API list. According to [35], those selected APIs are useful to

distinguish malware and benign applications.

C. Shared Library Function Opcode Feature

Android provides the Java Native Interface (JNI) and allows

applications to incorporate native libraries. It is well known that

native code defeats Android security mechanisms because

native code is not covered by the security model. For example,

shared library files can be used to hide malicious behaviors or

to evade countermeasures against attacks. That is why

many malicious applications use native code to attack the

Android system.

To prevent malware with native code from hiding its

behaviors, our framework defines and uses the shared library

function features in the detection. Similar to the method feature

extraction, ARM opcode frequency and system call invocation

frequency are extracted from native code. While scanning the

disassembled code of each function, the opcodes and system

call invocations in each function are counted.

D. Permission Feature

Android is a privilege-separated operating system, and an

application runs with a unique system identifier. Android

provides a permission-based access control mechanism to

restrict the operations that a process can perform. In addition,

per-URI permissions are used to grant access to specific data.

To perform a certain behavior, an application should request

the necessary permissions from Android, and this means that

permissions defined in an application can indicate the behaviors

of an application.

The manifest file in the application includes various

information related to permissions. First, the permissions to be

requested when the application is installed are defined in the

manifest file. Second, security permission that can be used to

limit accesses to specific components is also defined to protect

the application. The permission-related information can be

collected by parsing the <uses-permission> tag and the

<permission> tag in the manifest file. The request

permissions’ names are collected from the <uses-permission> tag,

and the security permissions’ names,

permission groups and protection levels are collected from the

<permission> tag. The extracted request permissions and

security permissions (the tuples of name, permission group, and

protection level) are used as permission features.
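The manifest parsing described above can be sketched with Python's xml.etree.ElementTree, assuming the manifest has already been decoded to plain-text XML (for example by APKtool). The function names are illustrative. The attribute names android:name, android:permissionGroup, and android:protectionLevel are the standard Android manifest attributes.

```python
import xml.etree.ElementTree as ET

ANDROID_NS = "http://schemas.android.com/apk/res/android"


def android_attr(node, name):
    """Read an android:-namespaced attribute, or None if it is absent."""
    return node.get("{%s}%s" % (ANDROID_NS, name))


def permission_features(manifest_xml):
    """Return request permission names and security permission tuples."""
    root = ET.fromstring(manifest_xml)
    requested = [android_attr(node, "name")
                 for node in root.iter("uses-permission")]
    declared = [(android_attr(node, "name"),
                 android_attr(node, "permissionGroup"),
                 android_attr(node, "protectionLevel"))
                for node in root.iter("permission")]
    return requested, declared
```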

E. Component Feature

Application components are the essential building blocks of

an Android application. There are four components in an

Android application; Activity, service, broadcast receiver, and
