Journal ArticleDOI

Cybersecurity in the Era of Data Science: Examining New Adversarial Models

03 May 2019 - Vol. 17, Iss. 6, pp. 46-53
TL;DR: The ever-increasing volume, variety, and velocity of threats constitute a big-data problem in cybersecurity and necessitate the deployment of AI and machine-learning algorithms; the limitations and vulnerabilities of these AI/ML systems, combined with the complexity of the data, introduce a new adversarial model, which is defined and discussed in this article.
Abstract: The ever-increasing volume, variety, and velocity of threats constitute a big-data problem in cybersecurity and necessitate the deployment of AI and machine-learning (ML) algorithms. The limitations and vulnerabilities of AI/ML systems, combined with the complexity of the data, introduce a new adversarial model, which is defined and discussed in this article.
Citations
Journal ArticleDOI
TL;DR: This research comprehensively identifies and analyses cybersecurity assessment methods described in the scientific literature, to support researchers and practitioners in choosing a method for their assessments and to indicate areas that can be explored further.
Abstract: Cybersecurity assessments are crucial in building the assurance that vital cyberassets are effectively protected from threats. Multiple assessment methods have been proposed over the decades of the cybersecurity field’s development. However, a systematic literature search described in this paper reveals that reviews of these methods are practically missing. Thus, the primary objective of this research was to fill this gap by comprehensively identifying and analysing cybersecurity assessment methods described in the scientific literature. A structured research method and transparent criteria were applied for this purpose. As a result, thirty-two methods are presented in this paper. Particular attention is paid to the question of the methods’ applicability in realistic contexts and environments. In that regard, the challenges and limitations associated with the methods’ application, as well as potential approaches to addressing them, are indicated. In addition, the paper systematises the terminology and indicates complementary studies which can be helpful during assessments. Finally, the areas that leave space for improvement and directions for further research and development are indicated. The intention is to support researchers and practitioners in choosing the method to be applied in their assessments and to indicate the areas that can be further explored.

27 citations

Book ChapterDOI
01 Jan 2021
TL;DR: This paper applies ML techniques to cybersecurity (CS) datasets to identify issues, opportunities, and cybersecurity challenges, and contributes a framework that provides insight into the use of ML and DS for protecting cyberspace from CS attacks.
Abstract: Cybersecurity (CS) is one of the critical concerns in today’s fast-paced and interconnected world. Advances in IoT and other computing technologies have made life and business easier on the one hand, while on the other hand many security breaches are reported daily. These breaches cost individuals as well as organizations millions of dollars. Various datasets for cybersecurity are available on the Internet, and there is a need to benefit from them by extracting useful information to improve cybersecurity. The combination of data science (DS) and machine learning (ML) techniques can improve cybersecurity, as ML techniques help extract useful information from raw data. In this paper, we combine DS and ML for improving cybersecurity: we apply ML techniques to CS datasets to identify issues, opportunities, and cybersecurity challenges. As a contribution to research, we provide a framework that gives insight into the use of ML and DS for protecting cyberspace from CS attacks.

4 citations
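The chapter does not name the specific datasets or algorithms it uses, so the following is only a minimal sketch of the kind of supervised DS/ML pipeline it describes, assuming a hypothetical labeled dataset (network_flows.csv with a label column) and scikit-learn:

```python
# Minimal sketch of a supervised pipeline on a labeled cybersecurity dataset:
# train a classifier and report per-class precision/recall.
# The file name "network_flows.csv" and the "label" column are hypothetical.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report

df = pd.read_csv("network_flows.csv")            # hypothetical flow-level dataset
X = df.drop(columns=["label"])                   # numeric features per record
y = df["label"]                                  # e.g. "benign" vs. "attack"

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42
)

clf = RandomForestClassifier(n_estimators=200, random_state=42)
clf.fit(X_train, y_train)
print(classification_report(y_test, clf.predict(X_test)))
```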

Posted Content
TL;DR: This work proposes extracting features from Markov matrices constructed from opcode traces as a low-cost feature for unobfuscated and obfuscated malware detection, and empirically shows that this approach maintains a high detection rate while consuming less power than similar work.
Abstract: With the increased deployment of IoT and edge devices into commercial and user networks, these devices have become a new threat vector for malware authors. It is imperative to protect these devices as they become more prevalent in commercial and personal networks. However, due to their limited computational power and storage space, especially in the case of battery-powered devices, it is infeasible to deploy state-of-the-art malware detectors onto these systems. In this work, we propose extracting features from Markov matrices constructed from opcode traces as a low-cost feature for unobfuscated and obfuscated malware detection. We empirically show that our approach maintains a high detection rate while consuming less power than similar work.

2 citations
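The abstract above gives the idea but not the implementation details; a minimal sketch of the feature construction follows, with a toy opcode vocabulary and trace as stand-ins (the paper's exact extraction pipeline is not reproduced here):

```python
# Minimal sketch of the low-cost feature described above: an opcode-transition
# (Markov) matrix built from an opcode trace and flattened into a feature vector.
# The opcode vocabulary and toy trace are illustrative only.
import numpy as np

def markov_matrix(trace, vocab):
    idx = {op: i for i, op in enumerate(vocab)}
    counts = np.zeros((len(vocab), len(vocab)))
    for a, b in zip(trace, trace[1:]):           # count opcode bigrams
        counts[idx[a], idx[b]] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    row_sums[row_sums == 0] = 1.0                # avoid division by zero
    return counts / row_sums                     # row-normalised transition probabilities

vocab = ["mov", "add", "cmp", "jmp", "call"]
trace = ["mov", "add", "mov", "cmp", "jmp", "mov", "call", "mov"]
features = markov_matrix(trace, vocab).flatten() # feature vector for a classifier
print(features.shape)                            # (25,)
```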

Journal ArticleDOI
TL;DR: In this paper, the authors review the state of the art in Trustworthy ML (TML) research and identify open problems and challenges in the presence of an adversary that may take advantage of multilateral trade-offs among the TML desiderata.
Abstract: Model accuracy is the traditional metric employed in machine learning (ML) applications. However, privacy, fairness, and robustness guarantees are crucial as ML algorithms increasingly pervade our lives and play central roles in socially important systems. These four desiderata constitute the pillars of Trustworthy ML (TML) and may mutually inhibit or reinforce each other. It is necessary to understand and clearly delineate the trade-offs among these desiderata in the presence of adversarial attacks. However, the threat models for the desiderata differ, and the defenses introduced for each lead to further trade-offs in a multilateral adversarial setting (i.e., a setting attacking several pillars simultaneously). The first half of the paper reviews the state of the art in TML research, articulates known multilateral trade-offs, and identifies open problems and challenges in the presence of an adversary that may take advantage of such multilateral trade-offs. The fundamental shortcomings of statistical association-based TML are discussed, to motivate the use of causal methods to achieve TML. The second half of the paper, in turn, advocates the use of causal modeling in TML. Evidence is collected from across the literature that causal ML is well-suited to provide a unified approach to TML. Causal discovery and causal representation learning are introduced as essential stages of causal modeling, and a new threat model for causal ML is introduced to quantify the vulnerabilities introduced through the use of causal methods. The paper concludes with pointers to possible next steps in the development of a causal TML pipeline.

1 citation

Proceedings ArticleDOI
01 Dec 2020
TL;DR: In this paper, the authors propose extracting features from Markov matrices constructed from opcode traces as a low-cost feature for unobfuscated and obfuscated malware detection.
Abstract: With the increased deployment of IoT and edge devices into commercial and user networks, these devices have become a new threat vector for malware authors. It is imperative to protect these devices as they become more prevalent in commercial and personal networks. However, due to their limited computational power and storage space, especially in the case of battery-powered devices, it is infeasible to deploy state-of-the-art malware detectors onto these systems. In this work, we propose extracting features from Markov matrices constructed from opcode traces as a low-cost feature for unobfuscated and obfuscated malware detection. We empirically show that our approach maintains a high detection rate while consuming less power than similar work.

1 citation

References
Posted Content
TL;DR: The authors argue that the primary cause of neural networks’ vulnerability to adversarial perturbation is their linear nature; this explanation is supported by new quantitative results and gives the first account of the most intriguing fact about adversarial examples: their generalization across architectures and training sets.
Abstract: Several machine learning models, including neural networks, consistently misclassify adversarial examples: inputs formed by applying small but intentionally worst-case perturbations to examples from the dataset, such that the perturbed input results in the model outputting an incorrect answer with high confidence. Early attempts at explaining this phenomenon focused on nonlinearity and overfitting. We argue instead that the primary cause of neural networks’ vulnerability to adversarial perturbation is their linear nature. This explanation is supported by new quantitative results while giving the first explanation of the most intriguing fact about them: their generalization across architectures and training sets. Moreover, this view yields a simple and fast method of generating adversarial examples. Using this approach to provide examples for adversarial training, we reduce the test set error of a maxout network on the MNIST dataset.

4,967 citations
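The "simple and fast method" mentioned in this abstract is the fast gradient sign method; a minimal sketch, assuming a differentiable PyTorch classifier and inputs scaled to [0, 1] (the model and batch are placeholders, not the paper's maxout/MNIST setup):

```python
# Sketch of the fast gradient sign method: perturb the input by eps in the
# direction of the sign of the loss gradient with respect to that input.
import torch
import torch.nn.functional as F

def fgsm(model, x, y, eps):
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)          # loss on the clean input
    loss.backward()
    x_adv = x + eps * x.grad.sign()              # single-step perturbation
    return x_adv.clamp(0.0, 1.0).detach()        # keep inputs in a valid range

# usage (placeholder model and batch):
# x_adv = fgsm(model, images, labels, eps=0.25)
```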

Posted Content
TL;DR: This paper uses influence functions — a classic technique from robust statistics — to trace a model's prediction through the learning algorithm and back to its training data, thereby identifying training points most responsible for a given prediction.
Abstract: How can we explain the predictions of a black-box model? In this paper, we use influence functions, a classic technique from robust statistics, to trace a model’s prediction through the learning algorithm and back to its training data, thereby identifying training points most responsible for a given prediction. To scale up influence functions to modern machine learning settings, we develop a simple, efficient implementation that requires only oracle access to gradients and Hessian-vector products. We show that even on non-convex and non-differentiable models where the theory breaks down, approximations to influence functions can still provide valuable information. On linear models and convolutional neural networks, we demonstrate that influence functions are useful for multiple purposes: understanding model behavior, debugging models, detecting dataset errors, and even creating visually-indistinguishable training-set attacks.

1,492 citations
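The influence-function formula itself is short; here is a minimal sketch for a small L2-regularised logistic-regression model where the Hessian can be inverted directly. The helper names and the {-1, +1} label convention are illustrative assumptions, not the paper's released code; the Hessian-vector-product machinery it uses for scaling is omitted.

```python
# Influence of up-weighting each training point z on the loss at a test point:
#   I(z, z_test) ~= -grad L(z_test)^T  H^{-1}  grad L(z)
# shown for logistic regression with labels in {-1, +1} and a trained theta.
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def grad_loss(theta, x, y):
    # gradient of log(1 + exp(-y * x.theta)) with respect to theta
    return -y * x * sigmoid(-y * (x @ theta))

def hessian(theta, X, lam):
    p = sigmoid(X @ theta)
    H = (X * (p * (1 - p))[:, None]).T @ X / len(X)
    return H + lam * np.eye(len(theta))          # regulariser keeps H invertible

def influence_on_test_loss(theta, X_train, y_train, x_test, y_test, lam=1e-2):
    H_inv = np.linalg.inv(hessian(theta, X_train, lam))
    g_test = grad_loss(theta, x_test, y_test)
    return np.array([-g_test @ H_inv @ grad_loss(theta, x, y)
                     for x, y in zip(X_train, y_train)])
```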

Posted Content
TL;DR: This paper shows that adversarial training confers robustness to single-step attack methods, that multi-step attack methods are somewhat less transferable than single-step methods, and that single-step attacks are therefore the best choice for mounting black-box attacks.
Abstract: Adversarial examples are malicious inputs designed to fool machine learning models. They often transfer from one model to another, allowing attackers to mount black box attacks without knowledge of the target model's parameters. Adversarial training is the process of explicitly training a model on adversarial examples, in order to make it more robust to attack or to reduce its test error on clean inputs. So far, adversarial training has primarily been applied to small problems. In this research, we apply adversarial training to ImageNet. Our contributions include: (1) recommendations for how to successfully scale adversarial training to large models and datasets, (2) the observation that adversarial training confers robustness to single-step attack methods, (3) the finding that multi-step attack methods are somewhat less transferable than single-step attack methods, so single-step attacks are the best for mounting black-box attacks, and (4) resolution of a "label leaking" effect that causes adversarially trained models to perform better on adversarial examples than on clean examples, because the adversarial example construction process uses the true label and the model can learn to exploit regularities in the construction process.

1,294 citations
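A minimal sketch of one adversarial-training step in the spirit described above, mixing clean inputs with single-step (FGSM-style) adversarial versions of the same inputs. The model, optimizer, and batch are placeholders, and eps and the 50/50 loss mix are illustrative choices, not the paper's exact recipe:

```python
# One adversarial-training step: craft single-step adversarial examples against
# the current model, then update on a mix of clean and adversarial inputs.
import torch
import torch.nn.functional as F

def adversarial_training_step(model, optimizer, x, y, eps=0.03):
    # build single-step adversarial examples from the current model
    x_req = x.clone().detach().requires_grad_(True)
    F.cross_entropy(model(x_req), y).backward()
    x_adv = (x_req + eps * x_req.grad.sign()).clamp(0.0, 1.0).detach()

    # train on a mix of clean and adversarial inputs
    optimizer.zero_grad()
    loss = 0.5 * F.cross_entropy(model(x), y) + 0.5 * F.cross_entropy(model(x_adv), y)
    loss.backward()
    optimizer.step()
    return loss.item()
```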

Posted Content
TL;DR: This work demonstrates the effectiveness of the Information-Plane visualization of DNNs and shows that training time is dramatically reduced when more hidden layers are added, suggesting that the main advantage of the hidden layers is computational.
Abstract: Despite their great success, there is still no comprehensive theoretical understanding of learning with Deep Neural Networks (DNNs) or their inner organization. Previous work proposed to analyze DNNs in the Information Plane, i.e., the plane of the Mutual Information values that each layer preserves on the input and output variables. They suggested that the goal of the network is to optimize the Information Bottleneck (IB) tradeoff between compression and prediction, successively, for each layer. In this work we follow up on this idea and demonstrate the effectiveness of the Information-Plane visualization of DNNs. Our main results are: (i) most of the training epochs in standard DL are spent on compression of the input to an efficient representation and not on fitting the training labels. (ii) The representation compression phase begins when the training error becomes small and the Stochastic Gradient Descent (SGD) epochs change from a fast drift to smaller training error into a stochastic relaxation, or random diffusion, constrained by the training error value. (iii) The converged layers lie on or very close to the Information Bottleneck (IB) theoretical bound, and the maps from the input to any hidden layer and from this hidden layer to the output satisfy the IB self-consistent equations. This generalization-through-noise mechanism is unique to Deep Neural Networks and absent in one-layer networks. (iv) The training time is dramatically reduced when adding more hidden layers. Thus the main advantage of the hidden layers is computational. This can be explained by the reduced relaxation time, as it scales super-linearly (exponentially for simple diffusion) with the information compression from the previous layer.

1,159 citations
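The abstract describes measuring the mutual information between a layer's representation T and the labels Y. A minimal sketch of the binning-based estimator commonly used for such information-plane plots follows; the bin count and the binning approach itself are illustrative assumptions, since the exact experimental code is not given here:

```python
# Binned mutual-information estimate I(T; Y): discretise a layer's activations,
# treat each distinct binned activation pattern as one state of T, and compute
# the mutual information from the joint histogram with the labels Y.
import numpy as np

def mutual_information(t, y, n_bins=30):
    # t: (n_samples, n_units) layer activations; y: (n_samples,) integer labels
    edges = np.linspace(t.min(), t.max(), n_bins + 1)
    t_binned = np.digitize(t, edges)
    _, t_ids = np.unique(t_binned, axis=0, return_inverse=True)

    joint = np.zeros((t_ids.max() + 1, y.max() + 1))
    np.add.at(joint, (t_ids, y), 1)              # joint histogram of (T, Y)
    p_ty = joint / joint.sum()
    p_t = p_ty.sum(axis=1, keepdims=True)
    p_y = p_ty.sum(axis=0, keepdims=True)
    nz = p_ty > 0
    return float((p_ty[nz] * np.log2(p_ty[nz] / (p_t @ p_y)[nz])).sum())
```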

Journal Article
TL;DR: This paper proposes a procedure which, based on a set of assumptions, makes it possible to explain the decisions of any classification method.
Abstract: After building a classifier with modern tools of machine learning, we typically have a black box at hand that is able to predict well for unseen data. Thus, we get an answer to the question of what the most likely label of a given unseen data point is. However, most methods will provide no answer as to why the model predicted a particular label for a single instance and what features were most influential for that particular instance. The only methods that are currently able to provide such explanations are decision trees. This paper proposes a procedure which, based on a set of assumptions, makes it possible to explain the decisions of any classification method.

888 citations
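The procedure itself is not reproduced here, but the underlying idea, that a local explanation can be read off the gradient of the predicted class probability at the instance of interest, fits in a few lines. A minimal sketch under that assumption, for a logistic-regression classifier with hypothetical trained parameters w and b:

```python
# Gradient-based local explanation for a differentiable classifier: the gradient
# of the predicted class probability with respect to the input indicates which
# features most influence this particular prediction. For logistic regression
# the gradient has a closed form; w, b, and x are assumed given (trained weights,
# bias, and one instance).
import numpy as np

def local_explanation(w, b, x):
    p = 1.0 / (1.0 + np.exp(-(w @ x + b)))       # predicted probability of class 1
    return p * (1 - p) * w                       # d p(x) / d x: one value per feature

# usage: the features with the largest absolute components matter most
# for this particular instance
# explanation = local_explanation(w, b, x)
```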