Book ChapterDOI

Cybersecurity for Data Science: Issues, Opportunities, and Challenges

TL;DR: This paper applies ML techniques to CS datasets to identify cybersecurity issues, opportunities, and challenges, and provides a framework that offers insight into the use of ML and DS for protecting cyberspace from CS attacks.
Abstract: Cybersecurity (CS) is one of the critical concerns in today’s fast-paced and interconnected world. Advances in IoT and other computing technologies have made human life and business easier on the one hand, while on the other many security breaches are reported daily. These security breaches cost individuals and organizations millions of dollars. Various datasets for cybersecurity are available on the Internet. There is a need to benefit from these datasets by extracting useful information from them to improve cybersecurity. The combination of data science (DS) and machine learning (ML) techniques can improve cybersecurity, as machine learning techniques help extract useful information from raw data. In this paper, we combine DS and ML to improve cybersecurity. We apply ML techniques to CS datasets to identify the issues, opportunities, and cybersecurity challenges. As a contribution to research, we provide a framework that offers insight into the use of ML and DS for protecting cyberspace from CS attacks.
Citations
Proceedings ArticleDOI
16 Feb 2022
TL;DR: In this article, the authors compared the Logistic Regression algorithm with the Gaussian algorithm for detecting spam content over the internet and social media.
Abstract: Aim: To detect spam content over the internet and social media using the Logistic Regression algorithm in comparison with the Gaussian algorithm. Methods and Materials: Detection of spam content messages is performed using the Logistic Regression algorithm and the Gaussian algorithm (sample size = 20), where values are taken randomly. G-power was maintained at 80%. Results and Discussion: This article is an attempt to improve the accuracy of spam content detection using the Logistic Regression algorithm, a machine learning algorithm. The AI-based application avoids overfitting. The proposed model achieves an accuracy of 95% in spam detection, with a p value less than 0.03 (p<0.05), compared with 93% for the Gaussian algorithm. Conclusion: The outcomes of the proposed Logistic Regression model were compared with those of the Gaussian algorithm. The proposed algorithm seems to have higher accuracy than the Gaussian algorithm.

1 citations

Journal ArticleDOI
20 Jan 2023-Sensors
TL;DR: In this paper, the authors proposed a methodology for the generation of synthetic samples of malicious Portable Executable binaries and DoS cyber-attacks via a reinforcement learning engine, which learns from a baseline of different malware families and cyber-attack network properties, resulting in new, mutated and highly functional samples.
Abstract: In recent years, cybersecurity has been strengthened through the adoption of processes, mechanisms and rapid sources of indicators of compromise in critical areas. Among the most latent challenges are the detection, classification and eradication of malware and Denial of Service (DoS) cyber-attacks. The literature has presented different ways to obtain and evaluate malware- and DoS-cyber-attack-related instances, either from a technical point of view or by offering ready-to-use datasets. However, acquiring fresh, up-to-date samples requires an arduous process of exploration, sandbox configuration and mass storage, which may ultimately result in an unbalanced or under-represented set. Synthetic sample generation has shown that the cost associated with setting up controlled environments and the time spent on sample evaluation can be reduced. Nevertheless, the process is performed when the observations already belong to a characterized set, totally detached from a real environment. To address these issues, this work proposes a methodology for the generation of synthetic samples of malicious Portable Executable binaries and DoS cyber-attacks. The task is performed via a Reinforcement Learning engine, which learns from a baseline of different malware families and DoS cyber-attack network properties, resulting in new, mutated and highly functional samples. Experimental results demonstrate the high adaptability of the outputs as new input datasets for different Machine Learning algorithms.

1 citations

Proceedings ArticleDOI
16 Feb 2022
TL;DR: This survey examines a few of the key benefits of maintaining information integrity, and the risks to the technology industry these days, and to the healthcare industry in particular, if it is not strictly maintained.
Abstract: For any professional, maintaining data integrity is a demanding task. The purpose of this study is to compile a list of research efforts in the field of healthcare data integrity. The study uses attack statistics to highlight the importance of data safety issues in healthcare. Data integrity breaches are perhaps a more serious threat to the healthcare business than data theft, as they might allow attackers to change anything in the record. By their very nature, breaches of healthcare data integrity are harder to identify, and in many situations the true consequences of such hacks have yet to be determined. This survey examines a few of the key benefits of maintaining information integrity, and the risks to the technology industry these days, and to the healthcare industry in particular, if it is not strictly maintained.
References
Journal ArticleDOI
TL;DR: This paper argues that, although there is a substantial overlap between cyber security and information security, these two concepts are not totally analogous and posits that cyber security goes beyond the boundaries of traditional information security to include not only the protection of information resources, but also that of other assets, including the person him/herself.
Abstract: The term cyber security is often used interchangeably with the term information security. This paper argues that, although there is a substantial overlap between cyber security and information security, these two concepts are not totally analogous. Moreover, the paper posits that cyber security goes beyond the boundaries of traditional information security to include not only the protection of information resources, but also that of other assets, including the person him/herself. In information security, reference to the human factor usually relates to the role(s) of humans in the security process. In cyber security this factor has an additional dimension, namely, the humans as potential targets of cyber attacks or even unknowingly participating in a cyber attack. This additional dimension has ethical implications for society as a whole, since the protection of certain vulnerable groups, for example children, could be seen as a societal responsibility.

660 citations

Journal ArticleDOI
TL;DR: N-BaIoT is a network-based anomaly detection method for the IoT that extracts behavior snapshots of the network and uses deep autoencoders to detect anomalous network traffic from compromised IoT devices.
Abstract: The proliferation of IoT devices that can be more easily compromised than desktop computers has led to an increase in IoT-based botnet attacks. To mitigate this threat, there is a need for new methods that detect attacks launched from compromised IoT devices and that differentiate between hours- and milliseconds-long IoT-based attacks. In this article, we propose a novel network-based anomaly detection method for the IoT called N-BaIoT that extracts behavior snapshots of the network and uses deep autoencoders to detect anomalous network traffic from compromised IoT devices. To evaluate our method, we infected nine commercial IoT devices in our lab with two widely known IoT-based botnets, Mirai and BASHLITE. The evaluation results demonstrated our proposed method's ability to accurately and instantly detect the attacks as they were being launched from the compromised IoT devices that were part of a botnet.

603 citations

Journal ArticleDOI
TL;DR: A novel network-based anomaly detection method for the IoT called N-BaIoT that extracts behavior snapshots of the network and uses deep autoencoders to detect anomalous network traffic from compromised IoT devices.
Abstract: The proliferation of IoT devices which can be more easily compromised than desktop computers has led to an increase in the occurrence of IoT-based botnet attacks. In order to mitigate this new threat, there is a need to develop new methods for detecting attacks launched from compromised IoT devices and for differentiating between hour- and millisecond-long IoT-based attacks. In this paper we propose and empirically evaluate a novel network-based anomaly detection method which extracts behavior snapshots of the network and uses deep autoencoders to detect anomalous network traffic emanating from compromised IoT devices. To evaluate our method, we infected nine commercial IoT devices in our lab with two of the most widely known IoT-based botnets, Mirai and BASHLITE. Our evaluation results demonstrated our proposed method's ability to accurately and instantly detect the attacks as they were being launched from the compromised IoT devices which were part of a botnet.

407 citations

Journal ArticleDOI
TL;DR: This paper focuses on and briefly discusses cybersecurity data science, in which data is gathered from relevant cybersecurity sources and the analytics complement the latest data-driven patterns to provide more effective security solutions.
Abstract: In a computing context, cybersecurity has recently been undergoing massive shifts in technology and operations, and data science is driving the change. Extracting security incident patterns or insights from cybersecurity data and building a corresponding data-driven model is the key to making a security system automated and intelligent. To understand and analyze the actual phenomena with data, various scientific methods, machine learning techniques, processes, and systems are used, which is commonly known as data science. In this paper, we focus on and briefly discuss cybersecurity data science, in which data is gathered from relevant cybersecurity sources and the analytics complement the latest data-driven patterns to provide more effective security solutions. The concept of cybersecurity data science makes the computing process more actionable and intelligent than traditional approaches in the domain of cybersecurity. We then discuss and summarize a number of associated research issues and future directions. Furthermore, we provide a machine learning based multi-layered framework for the purpose of cybersecurity modeling. Overall, our goal is not only to discuss cybersecurity data science and relevant methods but also to focus on their applicability to data-driven intelligent decision making for protecting systems from cyber-attacks.

240 citations

Journal ArticleDOI
TL;DR: This research review analyzes smart home approaches and challenges, suggests possible solutions, and illustrates open issues that still need to be addressed.
Abstract: The smart home is considered an essential domain in Internet of Things (IoT) applications. It is an interconnected home where all types of things interact with each other via the Internet. This helps to automate the home by making it smart and interconnected. However, at the same time, it raises great privacy and security concerns for users because it can be controlled remotely. Hence, the rapid technological growth of IoT raises abundant challenges, such as how to provide home users with safe and secure services that take privacy into account, and how to manage the smart home successfully under controlled conditions to avoid any loss of secrecy or theft of personal data. A number of research papers address these critical issues, and researchers have presented different approaches to overcome them. This research review analyzes smart home approaches and challenges, suggests possible solutions, and illustrates open issues that still need to be addressed.

165 citations