
Showing papers in "EURASIP Journal on Information Security" (2019)


Journal ArticleDOI
TL;DR: A clustering-enhanced transfer learning approach, called CeHTL, which can automatically find the relation between a new attack and known attacks; evaluated against several conventional classification models such as decision trees, random forests, and KNN, as well as other novel transfer learning approaches as strong baselines, CeHTL performed best.
Abstract: Network attacks are serious concerns in today’s increasingly interconnected society. Recent studies have applied conventional machine learning to network attack detection by learning the patterns of network behaviors and training a classification model. These models usually require large labeled datasets; however, the rapid pace and unpredictability of cyber attacks make this labeling impossible in real time. To address these problems, we propose utilizing transfer learning for detecting new and unseen attacks by transferring the knowledge of known attacks. In our previous work, we proposed a transfer learning-enabled framework and approach, called HeTL, which can find the common latent subspace of two different attacks and learn an optimized representation that is invariant to changes in attack behaviors. However, HeTL relied on manual pre-settings of hyper-parameters such as the relatedness between the source and target attacks. In this paper, we extend this study by proposing a clustering-enhanced transfer learning approach, called CeHTL, which can automatically find the relation between the new attack and known attacks. We evaluated these approaches by simulating scenarios where the testing dataset contains attack types or subtypes different from the training set. We chose several conventional classification models such as decision trees, random forests, KNN, and other novel transfer learning approaches as strong baselines. Results showed that the proposed HeTL and CeHTL improved performance remarkably. CeHTL performed best, demonstrating the effectiveness of transfer learning in detecting new network attacks.

90 citations
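
The paper’s HeTL/CeHTL formulations are not reproduced in the abstract, but the core idea of aligning a known (source) attack’s feature distribution with a new (target) attack’s before training can be illustrated with a simple CORAL-style covariance alignment. This is a minimal sketch under that assumption, not the authors’ algorithm; all names and data are illustrative.

```python
import numpy as np

def coral_align(Xs, Xt, eps=1e-6):
    """CORAL-style alignment: whiten the source covariance, then
    re-color with the target covariance so a classifier trained on
    the known (source) attack transfers to the new (target) attack."""
    def msqrt(C, inv=False):
        w, V = np.linalg.eigh(C)
        w = np.clip(w, eps, None)
        return (V * w ** (-0.5 if inv else 0.5)) @ V.T
    Cs = np.cov(Xs, rowvar=False) + eps * np.eye(Xs.shape[1])
    Ct = np.cov(Xt, rowvar=False) + eps * np.eye(Xt.shape[1])
    return Xs @ msqrt(Cs, inv=True) @ msqrt(Ct)

# Toy usage: align features of a known attack to a new attack's domain.
rng = np.random.default_rng(0)
Xs = rng.normal(size=(200, 5)) @ rng.normal(size=(5, 5))  # source features
Xt = rng.normal(size=(200, 5))                            # target features
Xs_aligned = coral_align(Xs, Xt)
```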


Journal ArticleDOI
TL;DR: This paper investigates a real-world use case from the energy domain, where customers trade portions of their photovoltaic power plants via a blockchain; it introduces the core concepts of blockchain technology and implements a fully custom, private, and permissioned blockchain from scratch.
Abstract: Blockchains are proposed for many application domains apart from financial transactions. While there are generic blockchains that can be molded for specific use cases, they often lack a lightweight and easy-to-customize implementation. In this paper, we introduce the core concepts of blockchain technology and investigate a real-world use case from the energy domain, where customers trade portions of their photovoltaic power plant via a blockchain. This involves not only blockchain technology but also user interaction. Therefore, a fully custom, private, and permissioned blockchain is implemented from scratch. We evaluate and motivate the need for blockchain technology within this use case, as well as the desired properties of the system. We then describe the implementation and the insights from our implementation in detail, serving as a guide for others and to show potential opportunities and pitfalls when implementing a blockchain from scratch.

75 citations
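
As a flavor of what the skeleton of such a custom, private chain looks like, here is a minimal hash-linked chain in Python. It is a sketch under broad assumptions (no consensus, no permissioning layer, invented transaction fields), not the authors’ implementation.

```python
import hashlib, json, time

class Block:
    """One block of a minimal hash-linked chain (illustrative only)."""
    def __init__(self, index, transactions, prev_hash):
        self.index = index
        self.timestamp = time.time()
        self.transactions = transactions  # e.g., PV-plant share trades
        self.prev_hash = prev_hash
        self.hash = self.compute_hash()

    def compute_hash(self):
        payload = json.dumps(
            {"index": self.index, "ts": self.timestamp,
             "tx": self.transactions, "prev": self.prev_hash},
            sort_keys=True)
        return hashlib.sha256(payload.encode()).hexdigest()

chain = [Block(0, [], "0" * 64)]  # genesis block

def append_block(transactions):
    chain.append(Block(len(chain), transactions, chain[-1].hash))

def verify_chain():
    """Every block must reference its predecessor's hash and re-hash
    to its stored value; any tampering breaks the link."""
    return all(b.prev_hash == p.hash and b.hash == b.compute_hash()
               for p, b in zip(chain, chain[1:]))

append_block([{"seller": "A", "buyer": "B", "kWp_share": 0.5}])
assert verify_chain()
```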


Journal ArticleDOI
TL;DR: A deep learning framework by utilizing the bi-directional recurrent neural networks with long short-term memory, dubbed BRNN-LSTM is developed, which achieves a significantly higher prediction accuracy when compared with the statistical approach.
Abstract: Just as with weather forecasting, the usefulness of forecasting or predicting cyber threats can hardly be overestimated. Previous investigations show that cyber attack data exhibits interesting phenomena, such as long-range dependence and high nonlinearity, which impose a particular challenge on modeling and predicting cyber attack rates. Deviating from the statistical approach utilized in the literature, in this paper we develop a deep learning framework based on bi-directional recurrent neural networks with long short-term memory, dubbed BRNN-LSTM. An empirical study shows that BRNN-LSTM achieves a significantly higher prediction accuracy than the statistical approach.

48 citations
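
The paper’s exact architecture and attack-rate datasets are not given in the abstract; the following is a minimal Keras sketch of a bidirectional LSTM one-step-ahead forecaster, trained on a synthetic series as a stand-in for real attack-rate data.

```python
import numpy as np
import tensorflow as tf

# Synthetic stand-in for a cyber attack-rate time series.
series = np.sin(np.arange(1000) / 10).astype("float32")
win = 20
X = np.stack([series[i:i + win] for i in range(len(series) - win)])[..., None]
y = series[win:]

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(win, 1)),
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64)),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=5, batch_size=32, verbose=0)

next_value = model.predict(X[-1:], verbose=0)  # one-step-ahead forecast
```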


Journal ArticleDOI
TL;DR: The intuition behind the system is that multi-class classification is quite difficult compared to binary classification, so it is divided into multiple binary classifications and implemented in parallel using GPUs to perform intrusion detection in real time.
Abstract: Recent advances in intrusion detection systems based on machine learning have indeed outperformed other techniques, but struggle with detecting multiple classes of attacks with high accuracy. We propose a method that works in three stages. First, the ExtraTrees classifier is used to select relevant features for each type of attack individually, for each extreme learning machine (ELM). Then, an ensemble of ELMs is used to detect each type of attack separately. Finally, the results of all ELMs are combined using a softmax layer to refine the results and increase the accuracy further. The intuition behind our system is that multi-class classification is quite difficult compared to binary classification, so we divide the multi-class problem into multiple binary classifications. We test our method on the UNSW and KDDcup99 datasets. The results clearly show that our proposed method is able to outperform all the other methods by a wide margin. Our system is able to achieve 98.24% and 99.76% accuracy for multi-class classification on the UNSW and KDDcup99 datasets, respectively. Additionally, we use the weighted extreme learning machine to alleviate the problem of imbalance in the classification of attacks, which further boosts performance. Lastly, we implement the ensemble of ELMs in parallel using GPUs to perform intrusion detection in real time.

45 citations
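
scikit-learn ships no extreme learning machine, so the sketch below substitutes logistic regression for each per-attack binary detector while keeping the structure the abstract describes: per-class ExtraTrees feature selection, one binary detector per attack class, and a softmax combination of the scores. Dataset and model choices are illustrative.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

X, y = make_classification(n_samples=2000, n_features=40, n_informative=10,
                           n_classes=4, n_clusters_per_class=1, random_state=0)
classes = np.unique(y)

# One binary detector per attack class, each with its own
# ExtraTrees-based feature selection (logistic regression stands in
# for the paper's extreme learning machines).
detectors = []
for c in classes:
    clf = make_pipeline(
        SelectFromModel(ExtraTreesClassifier(n_estimators=100, random_state=0)),
        LogisticRegression(max_iter=1000))
    clf.fit(X, (y == c).astype(int))
    detectors.append(clf)

# Combine the per-class scores with a softmax, then pick the arg-max class.
scores = np.column_stack([d.predict_proba(X)[:, 1] for d in detectors])
probs = np.exp(scores) / np.exp(scores).sum(axis=1, keepdims=True)
pred = classes[probs.argmax(axis=1)]
print("train accuracy:", (pred == y).mean())
```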


Journal ArticleDOI
TL;DR: Experiments show that overall, the proposed methods provide much better code coverage, which in turn leads to more accurate machine learning-based malware detection compared to the state-of-the-art approach.
Abstract: This paper investigates the impact of code coverage on machine learning-based dynamic analysis of Android malware. In order to maximize the code coverage, dynamic analysis on Android typically requires the generation of events to trigger the user interface and maximize the discovery of the run-time behavioral features. The commonly used event generation approach in most existing Android dynamic analysis systems is the random-based approach implemented with the Monkey tool that comes with the Android SDK. Monkey is utilized in popular dynamic analysis platforms like AASandbox, vetDroid, MobileSandbox, TraceDroid, Andrubis, ANANAS, DynaLog, and HADM. In this paper, we propose and investigate approaches based on stateful event generation and compare their code coverage capabilities with the state-of-the-practice random-based Monkey approach. The two proposed approaches are the state-based method (implemented with DroidBot) and a hybrid approach that combines the state-based and random-based methods. We compare the three different input generation methods on real devices, in terms of their ability to log dynamic behavior features and the impact on various machine learning algorithms that utilize the behavioral features for malware detection. Experiments performed using 17,444 applications show that overall, the proposed methods provide much better code coverage, which in turn leads to more accurate machine learning-based malware detection compared to the state-of-the-art approach.

30 citations
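
The random-based baseline is easy to reproduce: Monkey ships with the Android SDK and is driven through adb. Below is a minimal Python wrapper; the package name, seed, and event count are illustrative, and the state-based DroidBot runs (driven by DroidBot’s own command-line tool) are not shown.

```python
import subprocess

PACKAGE = "com.example.app"  # hypothetical app under analysis
SEED, EVENTS = 42, 500

# Random-based event generation with the SDK's Monkey tool; a fixed
# seed makes the pseudo-random event stream reproducible across runs.
subprocess.run(
    ["adb", "shell", "monkey", "-p", PACKAGE,
     "-s", str(SEED), "--throttle", "100", "-v", str(EVENTS)],
    check=True)
```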


Journal ArticleDOI
TL;DR: A fine-grain text watermarking method that protects even small portions of the digital content, ensures visual indistinguishability and length preservation, and is robust to the copy and paste of small excerpts of the text.
Abstract: The current online digital world, consisting of thousands of newspapers, blogs, social media, and cloud file sharing services, provides easy and unlimited access to a large treasure of text content. Making copies of this text content is simple and virtually costless. As a result, producers and owners of text content are interested in the protection of their intellectual property (IP) rights. Digital watermarking has become crucially important in the protection of digital content. Among all media, text watermarking poses many challenges, since text is characterized by a low capacity to embed a watermark and allows only a restricted number of alternative syntactic and semantic permutations. This becomes even harder when authors want to protect not just a whole book or article, but each single sentence or paragraph, a problem well known to copyright law. In this paper, we present a fine-grain text watermarking method that protects even small portions of the digital content. The core method is based on homoglyph character substitution for Latin symbols and whitespaces. It produces a watermarked version of the original text while preserving the anonymity of the users, in accordance with the right to privacy. In particular, the embedding and extraction algorithms allow the watermark to be protected continuously throughout the whole document in a fine-grain fashion. The method ensures visual indistinguishability and length preservation, meaning that it does not cause overhead to the original document, and it is robust to the copy and paste of small excerpts of the text. We use a real dataset of 1.8 million New York Times articles to evaluate our method. We evaluate and compare the robustness against common attacks, and we propose a new measure for partial copy-and-paste robustness. The results show the effectiveness of our approach, which requires an average length of 101 characters to embed the watermark and protects paragraph-long or smaller excerpts 94.5% of the time.

27 citations
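
The core embedding idea, substituting Latin characters with visually identical Unicode homoglyphs to encode bits while preserving text length, can be sketched in a few lines. The five-character homoglyph table below is a small illustrative subset, not the paper’s full table, and the bit layout is invented.

```python
# Latin letters paired with visually identical Cyrillic homoglyphs.
HOMOGLYPHS = {"a": "\u0430", "c": "\u0441", "e": "\u0435",
              "o": "\u043e", "p": "\u0440"}
REVERSE = {v: k for k, v in HOMOGLYPHS.items()}

def embed(text, bits):
    """Swap a substitutable character for its homoglyph for each 1-bit;
    the visible text and its length are unchanged."""
    out, i = [], 0
    for ch in text:
        if i < len(bits) and ch in HOMOGLYPHS:
            out.append(HOMOGLYPHS[ch] if bits[i] == "1" else ch)
            i += 1
        else:
            out.append(ch)
    if i < len(bits):
        raise ValueError("text too short to hold the watermark")
    return "".join(out)

def extract(text, n_bits):
    """Read bits back: homoglyph present = 1, original Latin char = 0."""
    bits = []
    for ch in text:
        if ch in HOMOGLYPHS or ch in REVERSE:
            bits.append("1" if ch in REVERSE else "0")
            if len(bits) == n_bits:
                break
    return "".join(bits)

plain = "the quick brown fox jumps over the lazy dog"
marked = embed(plain, "1011")
assert extract(marked, 4) == "1011"
assert len(marked) == len(plain)   # length preservation
```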


Journal ArticleDOI
TL;DR: A new crowdsource-based system called Click Fraud Crowdsourcing (CFC) that collaborates with both advertisers and ad networks in order to protect both parties from possible fraudulent clicks is proposed, and a new mobile ad charging model is proposed to charge advertisers based on the duration spent on the advertiser’s website.
Abstract: Mobile ads are plagued by fraudulent clicks, which are a major challenge for the advertising community. Although popular ad networks use many techniques to detect click fraud, they do not protect the client from possible collusion between publishers and ad networks. In addition, ad networks are not able to monitor the user’s activity for click fraud detection once the user is redirected to the advertising site after clicking the ad. We propose a new crowdsource-based system called Click Fraud Crowdsourcing (CFC) that collaborates with both advertisers and ad networks in order to protect both parties from possible fraudulent clicks. The system benefits from both a global view, where it gathers multiple ad requests corresponding to different ad network-publisher-advertiser combinations, and a local view, where it is able to track the users’ engagement on each advertising website. The results demonstrate that our approach offers a lower false positive rate (0.1) when detecting click fraud than solutions proposed in the literature, while maintaining a high true positive rate (0.9). Furthermore, we propose a new mobile ad charging model that benefits from our system to charge advertisers based on the duration spent on the advertiser’s website.

16 citations
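
The duration-based charging model can be made concrete with a toy pricing function; the thresholds and rates below are invented for illustration, not taken from the paper.

```python
# Hypothetical duration-based charging: instead of a flat cost per click,
# the advertiser pays in proportion to verified dwell time, so clicks
# that bounce immediately (a telltale of click fraud) cost nothing.
def charge(dwell_seconds, rate_per_minute=0.05, min_dwell=3, cap_minutes=5):
    if dwell_seconds < min_dwell:      # likely fraudulent or accidental
        return 0.0
    minutes = min(dwell_seconds / 60, cap_minutes)
    return round(rate_per_minute * minutes, 4)

assert charge(1) == 0.0    # bounce: not charged
assert charge(120) == 0.1  # two engaged minutes
```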


Journal ArticleDOI
TL;DR: A recommender system that uses text mining techniques, coupled with IntelliSense technology, to recommend fixes for potential vulnerabilities in program code; it was determined that surveyed participants strongly believed that a recommender system would help programmers write more secure code.
Abstract: Secure coding is crucial for the design of secure and efficient software and computing systems. However, many programmers avoid secure coding practices for a variety of reasons. Some of these reasons are lack of knowledge of secure coding standards, negligence, and poor performance of and usability issues with existing code analysis tools. Therefore, it is essential to create tools that address these issues and concerns. This article features the proposal, development, and evaluation of a recommender system that uses text mining techniques, coupled with IntelliSense technology, to recommend fixes for potential vulnerabilities in program code. The resulting system mines a large code base of over 1.6 million Java files using the MapReduce methodology, creating a knowledge base for a recommender system that provides fixes for taint-style vulnerabilities. Formative testing and a usability study determined that surveyed participants strongly believed that a recommender system would help programmers write more secure code.

15 citations
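
As a flavor of how a text-mining recommender over a code corpus can work, here is a minimal TF-IDF / cosine-similarity sketch. The snippets, fixes, and tokenization are illustrative stand-ins for the paper’s MapReduce-built knowledge base.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Tiny stand-in knowledge base: vulnerable patterns paired with fixes.
knowledge_base = [
    ("Statement stmt = conn.createStatement(); stmt.executeQuery(sql + userInput);",
     "Use PreparedStatement with bound parameters instead of concatenation."),
    ("Runtime.getRuntime().exec(cmd + userInput);",
     "Validate/whitelist the input and pass arguments as an array."),
]

vectorizer = TfidfVectorizer(token_pattern=r"[A-Za-z_]+")
kb_matrix = vectorizer.fit_transform(snippet for snippet, _ in knowledge_base)

def recommend_fix(code_fragment):
    """Return the fix paired with the most similar known-vulnerable snippet."""
    sims = cosine_similarity(vectorizer.transform([code_fragment]), kb_matrix)
    return knowledge_base[sims.argmax()][1]

print(recommend_fix('stmt.executeQuery("SELECT * FROM t WHERE id=" + id);'))
```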


Journal ArticleDOI
TL;DR: The results show that metadata filtering in combination with traditional biometric-based matching is indeed a powerful tool for providing reliable, and user-friendly, central authentication services for large user groups.
Abstract: While biometric authentication for commercial use has so far mainly been used for local device unlock use cases, there are great opportunities for using it also for central authentication such as remote login. However, many current biometric sensors, for instance mobile fingerprint sensors, have too large a false acceptance rate (FAR) to allow them, for security reasons, to be used in larger user groups for central identification purposes. A straightforward way to avoid this FAR problem is to either request a unique user identifier such as a device identifier or require the user to enter a unique user ID prior to the biometric matching. Usage of a device identifier does not work when a user desires to authenticate on a previously unused device of a generic type. Furthermore, requiring the user to enter a unique user ID at each login occasion is not at all user-friendly. To avoid this problem, in this paper we investigate an alternative, more user-friendly approach for identification in combination with biometric-based authentication using metadata filtering. An evaluation of the adopted approach is carried out using realistic simulations of the Swedish population to assess the feasibility of the proposed system. The results show that metadata filtering in combination with traditional biometric-based matching is indeed a powerful tool for providing reliable, and user-friendly, central authentication services for large user groups.

6 citations
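
The FAR problem the paper works around can be quantified directly: with per-comparison false acceptance rate f and n candidates remaining after metadata filtering, the probability of at least one false accept is 1 − (1 − f)^n. A small sketch with invented numbers:

```python
# With FAR per comparison f and a candidate set of size n after metadata
# filtering, the chance of at least one false acceptance is 1 - (1 - f)^n,
# so shrinking n is what makes 1:N central identification viable.
def false_accept_probability(far, candidates):
    return 1 - (1 - far) ** candidates

whole_population = false_accept_probability(1e-4, 10_000_000)  # ~1.0
filtered = false_accept_probability(1e-4, 50)                  # ~0.5%
print(f"unfiltered: {whole_population:.3f}, filtered: {filtered:.4f}")
```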


Journal ArticleDOI
TL;DR: An improved network security situation awareness model for IoT is proposed; a sequence kernel support vector machine is derived, and the particle swarm optimization (PSO) method is used to optimize related parameters.
Abstract: The Internet of Things (IoT) is a new technology that has developed rapidly in various fields in recent years. With the continuous application of IoT technology in production and daily life, the network security problems of IoT are increasingly prominent. In order to meet the challenges brought by the development of IoT technology, this paper focuses on network security situation awareness, which is fundamental to IoT network security. Situation prediction of network security is in essence a time series forecasting problem, so it is necessary to construct a modification function suitable for time series data to revise the kernel function of the traditional support vector machine (SVM). An improved network security situation awareness model for IoT is proposed in this paper: a sequence kernel support vector machine is derived, and the particle swarm optimization (PSO) method is used to optimize related parameters. The feasibility of the method is demonstrated using boundary data collected from a university campus IoT network. Finally, a comparison with PSO-SVM proves the effectiveness of this method in improving the accuracy of network security situation prediction for IoT. The experimental results show that the PSO-time series kernel support vector machine outperforms the PSO-Gauss kernel support vector machine in network security situation prediction. The application of the Hadoop platform also enhances the efficiency of data processing.

6 citations
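
The paper’s sequence kernel and exact PSO setup are not specified in the abstract; the sketch below shows the general pattern of PSO-tuning an SVM’s hyper-parameters for time-series prediction, using scikit-learn’s RBF-kernel SVR over lagged windows as a stand-in for the sequence kernel. Data, swarm size, and coefficients are illustrative.

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
series = np.sin(np.arange(300) / 8) + 0.1 * rng.normal(size=300)
win = 10
X = np.stack([series[i:i + win] for i in range(len(series) - win)])
y = series[win:]

def fitness(params):
    C, gamma = np.exp(params)       # search (C, gamma) in log-space
    model = SVR(C=C, gamma=gamma)   # RBF kernel over lagged windows
    return cross_val_score(model, X, y, cv=3,
                           scoring="neg_mean_squared_error").mean()

# Minimal particle swarm over (log C, log gamma).
n, iters = 12, 20
pos = rng.uniform(-3, 3, size=(n, 2))
vel = np.zeros_like(pos)
pbest, pbest_f = pos.copy(), np.array([fitness(p) for p in pos])
for _ in range(iters):
    g = pbest[pbest_f.argmax()]                    # global best position
    r1, r2 = rng.random((2, n, 1))
    vel = 0.7 * vel + 1.4 * r1 * (pbest - pos) + 1.4 * r2 * (g - pos)
    pos += vel
    f = np.array([fitness(p) for p in pos])
    improved = f > pbest_f
    pbest[improved], pbest_f[improved] = pos[improved], f[improved]
print("best (C, gamma):", np.exp(pbest[pbest_f.argmax()]))
```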


Journal ArticleDOI
TL;DR: A new approach is proposed to quantitatively evaluate the binary detection performance of biometric personal recognition systems, and it is observed that by combining different useful feature-extraction techniques, it is possible to improve the system accuracy.
Abstract: A new approach is proposed to quantitatively evaluate the binary detection performance of biometric personal recognition systems. The importance of the correlation between the overall detection performance and the area under the genuine acceptance rate (GAR) versus false acceptance rate (FAR) graph, commonly known as the receiver operating characteristic (ROC), is recognized. Using the ROC curve, the relation between GARmin and minimum recognition accuracy is derived, particularly for high security applications (HSA). Finally, the effectiveness of any binary recognition system is predicted using three important parameters, namely GARmin, the time required for recognition, and the computational complexity of the computer processing system. The palm print (PP) modality is used to validate the theoretical basis. It is observed that by combining different useful feature-extraction techniques, it is possible to improve the system accuracy. An optimum algorithm to appropriately choose weights has been suggested, which iteratively enhances the system accuracy. This also improves the effectiveness of the system.
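
The GAR-versus-FAR trade-off the paper builds on can be computed directly from genuine and impostor score samples; a minimal sketch with synthetic scores (the score distributions are invented):

```python
import numpy as np

rng = np.random.default_rng(1)
genuine = rng.normal(2.0, 1.0, 1000)   # match scores, same person
impostor = rng.normal(0.0, 1.0, 1000)  # match scores, different people

def gar_at_far(target_far):
    # Pick the threshold that lets only target_far of impostor scores
    # pass; GAR is the fraction of genuine scores above that threshold.
    thr = np.quantile(impostor, 1 - target_far)
    return (genuine >= thr).mean()

for far in (1e-1, 1e-2, 1e-3):  # tighter operating points for HSA
    print(f"FAR={far:.0e} -> GAR={gar_at_far(far):.3f}")
```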

Journal ArticleDOI
TL;DR: A protection scheme is described that preserves the integrity of genomic data in that scenario over a time horizon of 100 years. The evaluation shows that privacy-preserving long-term integrity protection of genomic data is resource-demanding, but within reach of current and future hardware technology, and has negligible storage costs.
Abstract: Genomic data is crucial in the understanding of many diseases and for the guidance of medical treatments. Pharmacogenomics and cancer genomics are just two areas in precision medicine of rapidly growing utilization. At the same time, whole-genome sequencing costs are plummeting below $1000, meaning that a rapid growth in full-genome data storage requirements is foreseeable. While privacy protection of genomic data is receiving growing attention, integrity protection of this long-lived and highly sensitive data receives much less. We consider a scenario inspired by future pharmacogenomics, in which a patient’s genome data is stored over a long time period while random parts of it are periodically accessed by authorized parties such as doctors and clinicians. A protection scheme is described that preserves the integrity of the genomic data in that scenario over a time horizon of 100 years. During such a long time period, cryptographic schemes will potentially break, and therefore our scheme allows the integrity protection to be updated. Furthermore, the integrity of parts of the genomic data can be verified without compromising the privacy of the remaining data. Finally, a performance evaluation and cost projection show that privacy-preserving long-term integrity protection of genomic data is resource-demanding, but within reach of current and future hardware technology, and has negligible storage costs.
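
One standard building block for verifying parts of a dataset without revealing the rest is a Merkle hash tree, which fits the paper’s requirement of part-wise integrity checks. The sketch below is a generic illustration (random placeholder chunks, no salting or timestamp renewal), not the authors’ scheme.

```python
import hashlib, os

def h(data):
    return hashlib.sha256(data).digest()

def merkle_root(leaves):
    level = [h(x) for x in leaves]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # duplicate last node on odd levels
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

def merkle_proof(leaves, idx):
    """Collect the sibling hashes needed to re-derive the root for one leaf."""
    level = [h(x) for x in leaves]
    proof = []
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        proof.append((level[idx ^ 1], idx % 2 == 0))
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        idx //= 2
    return proof

def verify(leaf, proof, root):
    node = h(leaf)
    for sibling, leaf_is_left in proof:
        node = h(node + sibling) if leaf_is_left else h(sibling + node)
    return node == root

# Commit to genome chunks and publish only the root; later, one chunk is
# verifiable without disclosing any of the others.
chunks = [os.urandom(16) for _ in range(8)]  # placeholder genome parts
root = merkle_root(chunks)
assert verify(chunks[3], merkle_proof(chunks, 3), root)
```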

Journal ArticleDOI
TL;DR: The spectral function approach of Stanko and Škorić, which provides such a fixed-length representation for fingerprints, is extended, and a helper data system consisting of zero-leakage quantisation followed by the Code Offset Method is constructed.
Abstract: Storage of biometric data requires some form of template protection in order to preserve the privacy of people enrolled in a biometric database. One approach is to use a Helper Data System. Here it is necessary to transform the raw biometric measurement into a fixed-length representation. In this paper, we extend the spectral function approach of Stanko and Škorić (IEEE Workshop on Information Forensics and Security (WIFS), 2017) which provides such a fixed-length representation for fingerprints. First, we introduce a new spectral function that captures different information from the minutia orientations. It is complementary to the original spectral function, and we use both of them to extract information from a fingerprint image. Second, we construct a helper data system consisting of zero-leakage quantisation followed by the Code Offset Method. We show empirical data on matching performance and entropy content. On the negative side, transforming a list of minutiae to the spectral representation degrades the matching performance significantly. On the positive side, adding privacy protection to the spectral representation can be done with little loss of performance.
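
The Code Offset Method itself is compact enough to sketch: the helper data is the XOR of a codeword and the (binary) biometric string, and a noisy re-measurement is corrected by decoding. A toy version with a repetition code follows; the paper’s actual code choice and zero-leakage quantisation are not reproduced.

```python
import numpy as np

rng = np.random.default_rng(0)
REP = 5  # repetition factor of the toy error-correcting code

def com_enroll(x, key_bits):
    """Code Offset Method enrollment: helper = codeword XOR biometric bits."""
    codeword = np.repeat(key_bits, REP)
    return codeword ^ x  # stored helper data w (reveals neither alone)

def com_reconstruct(w, x_noisy):
    """Recover the key from a noisy re-measurement by majority-vote
    decoding of the repetition code."""
    noisy_codeword = w ^ x_noisy
    return (noisy_codeword.reshape(-1, REP).sum(axis=1) > REP // 2).astype(int)

key = rng.integers(0, 2, 16)
x = rng.integers(0, 2, 16 * REP)   # binary enrollment measurement
w = com_enroll(x, key)

x_noisy = x.copy()
x_noisy[::REP] ^= 1                # one bit-flip per block: within capacity
assert np.array_equal(com_reconstruct(w, x_noisy), key)
```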

Journal ArticleDOI
TL;DR: This paper explores the components of the Trusted Computing Base in hardware-supported enclaves, provides a taxonomy and gives an extensive understanding of trade-offs during secure enclave development, and proposes an alternative approach for remote secret-code execution of private algorithms.
Abstract: Many applications are built upon private algorithms, and executing them in untrusted, remote environments poses confidentiality issues. To some extent, these problems can be addressed by ensuring the use of secure hardware in the execution environment; however, an insecure software stack can only provide limited algorithm secrecy. This paper aims to address this problem by exploring the components of the Trusted Computing Base (TCB) in hardware-supported enclaves. First, we provide a taxonomy and give an extensive understanding of trade-offs during secure enclave development. Next, we present a case study on existing secret-code execution frameworks, which suffer from poor TCB design because they process secrets with commodity software inside enclaves. This increased attack surface introduces additional footprints on memory that break the confidentiality guarantees; as a result, the private algorithms are leaked. Finally, we propose an alternative approach for remote secret-code execution of private algorithms. Our solution removes the potentially untrusted commodity software from the TCB and provides a minimal loader for secret-code execution. Based on our new enclave development paradigm, we demonstrate three industrial templates for cloud applications: ① computational power as a service, ② algorithm querying as a service, and ③ data querying as a service.
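
The provisioning flow behind remote secret-code execution can be simulated in plain Python: the algorithm owner releases the decryption key only to a loader whose measurement matches expectations. Real enclaves use hardware attestation (e.g., signed SGX quotes) rather than this hash check, so everything below is an illustrative simulation with invented names.

```python
import hashlib
from cryptography.fernet import Fernet

# The enclave's identity is the hash ("measurement") of a minimal loader;
# everything else (OS, runtimes) stays outside the trusted computing base.
LOADER_SOURCE = b"minimal loader v1"  # stand-in for the loader binary
EXPECTED_MEASUREMENT = hashlib.sha256(LOADER_SOURCE).hexdigest()

key = Fernet.generate_key()
secret_algorithm = b"result = 6 * 7"  # placeholder private algorithm
ciphertext = Fernet(key).encrypt(secret_algorithm)

def attest_and_provision(measurement):
    """Algorithm owner: release the decryption key only to the expected
    loader; any other software stack is refused."""
    if measurement != EXPECTED_MEASUREMENT:
        raise PermissionError("untrusted loader: key withheld")
    return key

# "Enclave" side: attest, receive the key, decrypt, and execute.
provisioned_key = attest_and_provision(
    hashlib.sha256(LOADER_SOURCE).hexdigest())
scope = {}
exec(Fernet(provisioned_key).decrypt(ciphertext), scope)
print(scope["result"])  # 42
```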

Journal ArticleDOI
TL;DR: A Spark-based real-time proactive image tracking protection model (SRPITP) is proposed to monitor the status of images under protection in real time; whenever illegal use is found, an alert is issued to image owners.
Abstract: With the rapid development of the Internet, images are spreading more and more quickly and widely, and illegal image usage occurs frequently, with marked impacts on people’s daily lives. Therefore, it is of great importance to protect image security and image owners’ rights. At present, most image protection is passive: often, owners discover illegal use of their images only after serious adverse consequences have appeared. In this paper, a Spark-based real-time proactive image tracking protection model (SRPITP) is proposed to monitor the status of images under protection in real time. Whenever illegal use is found, an alert is issued to image owners. The model mainly includes an image fingerprint extraction module, an image crawling module, and an image matching module. The experimental results show that in SRPITP, the image matching accuracy rate is above 98.9%, and compared with its stand-alone counterpart, the corresponding time reductions for image extraction and matching are about 58.78% and 61.67%.
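
The abstract does not specify the fingerprinting algorithm; a common choice for such matching modules is a perceptual average hash compared under Hamming distance, sketched below. The file names and threshold are illustrative assumptions, not the paper’s.

```python
import numpy as np
from PIL import Image

def average_hash(path, size=8):
    """64-bit perceptual fingerprint: downscale, convert to grayscale,
    and threshold each pixel against the image mean."""
    img = Image.open(path).convert("L").resize((size, size))
    pixels = np.asarray(img, dtype=np.float32)
    return (pixels > pixels.mean()).flatten()

def is_match(fp_a, fp_b, max_hamming=5):
    """Near-duplicates survive rescaling/recompression with only a few
    flipped bits, so a small Hamming distance signals a match."""
    return int(np.count_nonzero(fp_a != fp_b)) <= max_hamming

# Hypothetical usage: compare a crawled image against a protected one.
# protected = average_hash("owned.jpg")
# crawled = average_hash("suspect.jpg")
# if is_match(protected, crawled): alert_owner()
```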

Journal ArticleDOI
TL;DR: A new construction for generating hard random lattices with short bases is obtained by using a random matrix whose entries obey Gaussian sampling, which gives the corresponding schemes wider applicability in cryptography.
Abstract: This paper first gives a regularity theorem and its corollary. Then, a new construction for generating hard random lattices with short bases is obtained by using this corollary. This construction takes a new perspective and uses a random matrix whose entries obey Gaussian sampling, which gives the corresponding schemes wider applicability in cryptography. Moreover, this construction is more concrete than previous constructions, which makes it easier to implement in practical applications.
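
As a rough illustration of "a random matrix whose entries obey Gaussian sampling", the sketch below draws entries from a discrete Gaussian over the integers by exact inversion sampling; the paper’s regularity conditions and short-basis construction are not reproduced, and all parameters are invented.

```python
import numpy as np

rng = np.random.default_rng(0)

def discrete_gaussian(sigma, tail=10):
    """Sample from the discrete Gaussian D_{Z,sigma} by inversion over a
    truncated support (adequate for small sigma; illustrative only)."""
    t = int(tail * sigma)
    support = np.arange(-t, t + 1)
    probs = np.exp(-np.pi * support.astype(float) ** 2 / sigma ** 2)
    probs /= probs.sum()
    return int(rng.choice(support, p=probs))

# A random q-ary matrix with discrete-Gaussian entries; the paper's
# regularity conditions and trapdoor/basis extraction are omitted.
n, m, q, sigma = 4, 8, 97, 3.0
A = np.array([[discrete_gaussian(sigma) % q for _ in range(m)]
              for _ in range(n)])
print(A)
```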