Author

Faisal Saeed

Bio: Faisal Saeed is an academic researcher from Taibah University. The author has contributed to research in topics: Computer science & Consensus clustering. The author has an h-index of 12 and has co-authored 117 publications receiving 599 citations. Previous affiliations of Faisal Saeed include Technical University of Dortmund & Community College of Philadelphia.


Papers
Journal ArticleDOI
TL;DR: Extreme gradient boosting (Xgboost), which is an ensemble of Classification and Regression Tree and a variant of the Gradient Boosting Machine, was investigated for the prediction of biological activity based on quantitative description of the compound’s molecular structure and showed remarkable performance on both high and low diversity datasets.
Abstract: Following the explosive growth in chemical and biological data, the shift from traditional methods of drug discovery to computer-aided means has made data mining and machine learning methods integral parts of today's drug discovery process. In this paper, extreme gradient boosting (Xgboost), which is an ensemble of Classification and Regression Trees (CART) and a variant of the Gradient Boosting Machine, was investigated for the prediction of biological activity based on a quantitative description of the compound's molecular structure. Seven datasets that are well known in the literature were used, and the experimental results show that Xgboost can outperform machine learning algorithms such as Random Forest (RF), Support Vector Machines (LSVM), Radial Basis Function Neural Network (RBFN) and Naive Bayes (NB) for the prediction of biological activities. In addition to its ability to detect minority activity classes in highly imbalanced datasets, it showed remarkable performance on both high and low diversity datasets.

183 citations

Journal ArticleDOI
TL;DR: A misbehavior-aware on-demand collaborative intrusion detection system (MA-CIDS) based on the concept of distributed ensemble learning that performs better than the other existing models in terms of effectiveness and efficiency for VANET.
Abstract: Vehicular ad hoc networks (VANETs) play an important role as an enabling technology for future cooperative intelligent transportation systems (CITSs). Vehicles in VANETs share real-time information about their movement state, traffic situation, and road conditions. However, VANETs are susceptible to cyberattacks that create life-threatening situations and/or cause road congestion. Intrusion detection systems (IDSs) that rely on cooperation between vehicles to detect intruders have been the most commonly suggested security solutions for VANETs. Unfortunately, existing cooperative IDSs (CIDSs) are vulnerable to legitimate yet compromised collaborators that share misleading and manipulated information and disrupt the IDSs’ normal operation. As such, this paper proposes a misbehavior-aware on-demand collaborative intrusion detection system (MA-CIDS) based on the concept of distributed ensemble learning. That is, vehicles individually use the random forest algorithm to train local IDS classifiers and share their locally trained classifiers on demand with the vehicles in their vicinity, which reduces the communication overhead. Once received, the classifiers are evaluated using the local testing dataset in the receiving vehicle. The evaluation values serve as a trustworthiness factor and are used to rank the received classifiers. Classifiers that deviate substantially from the lower boundary of the box-and-whisker plot are excluded from the set of collaborators. Then, each vehicle constructs an ensemble of weighted random forest-based classifiers that encompasses the locally and remotely trained classifiers. The outputs of the classifiers are aggregated using a robust weighted voting scheme. Extensive simulations were conducted utilizing the network security laboratory-knowledge discovery data mining (NSL-KDD) dataset to evaluate the performance of the proposed MA-CIDS model. The obtained results show that MA-CIDS performs better than other existing models in terms of effectiveness and efficiency for VANETs.
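
The exclusion and voting steps described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the exact boundary rule or voting weights, so the sketch assumes the conventional box-plot lower fence (Q1 minus 1.5 times the interquartile range) and uses each classifier's local evaluation score directly as its vote weight. All names (`lower_fence`, `select_collaborators`, `weighted_vote`) and the example scores are hypothetical.

```python
from collections import defaultdict
from statistics import quantiles


def lower_fence(scores, k=1.5):
    """Box-and-whisker lower boundary, assumed here to be Q1 - k * IQR."""
    q1, _, q3 = quantiles(scores, n=4, method="inclusive")
    return q1 - k * (q3 - q1)


def select_collaborators(scores):
    """Keep classifiers whose local evaluation score is not below the fence."""
    fence = lower_fence(list(scores.values()))
    return {name: s for name, s in scores.items() if s >= fence}


def weighted_vote(predictions, weights):
    """Sum the weights behind each predicted label and return the heaviest."""
    totals = defaultdict(float)
    for name, label in predictions.items():
        totals[label] += weights[name]
    return max(totals, key=totals.get)


# Hypothetical accuracies of one local and four received classifiers,
# each measured on the receiving vehicle's local test set.
scores = {"local": 0.95, "v1": 0.93, "v2": 0.94, "v3": 0.96, "v4": 0.40}
trusted = select_collaborators(scores)  # "v4" falls below the fence

# Hypothetical per-classifier predictions for a single traffic record.
predictions = {"local": "attack", "v1": "attack", "v2": "normal",
               "v3": "attack", "v4": "normal"}
decision = weighted_vote({n: predictions[n] for n in trusted}, trusted)
print(sorted(trusted), decision)
```

In this toy run the poorly performing classifier is dropped before voting, so a compromised collaborator cannot influence the aggregated decision; a real system would likely need a more robust weighting than raw accuracy.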

55 citations

Journal ArticleDOI
TL;DR: A systematic literature review is presented to analyze the existing published literature regarding anomaly-based intrusion detection using deep learning techniques in securing IoT environments. It finds that supervised deep learning techniques offer better performance compared to unsupervised and semi-supervised learning.
Abstract: The Internet of Things (IoT) concept has emerged to improve people’s lives by providing a wide range of smart and connected devices and applications in several domains, such as green IoT-based agriculture, smart farming, smart homes, smart transportation, smart health, smart grid, smart cities, and smart environment. However, IoT devices are at risk of cyber attacks. Deep learning techniques have been adopted by researchers as a solution for securing the IoT environment. Deep learning has also been successfully implemented in various fields, proving its strength in intrusion detection. Because signature-based detection is limited for unknown attacks, the anomaly-based Intrusion Detection System (IDS) has an advantage in detecting zero-day attacks. In this paper, a systematic literature review (SLR) is presented to analyze the existing published literature regarding anomaly-based intrusion detection using deep learning techniques in securing IoT environments. Data from the published studies were retrieved from five databases (IEEE Xplore, Scopus, Web of Science, Science Direct, and MDPI). Out of 2116 identified records, 26 relevant studies were selected to answer the research questions. This review explored seven deep learning techniques practiced in IoT security, and the results showed their effectiveness in dealing with security challenges in the IoT ecosystem. It is also found that supervised deep learning techniques offer better performance compared to unsupervised and semi-supervised learning. This analysis provides insight into how data types and learning methods affect the performance of deep learning techniques, as a contribution towards developing a novel model for anomaly intrusion detection and prediction.

50 citations

Journal ArticleDOI
TL;DR: In this paper, the authors developed a model by integrating two theoretical models, the Theory of Planned Behavior and Norm Activation Theory, to explore individual factors that influence decision-makers in the manufacturing sector in Malaysia to adopt Green IT via the mediation of personal norms.
Abstract: Green IT has attracted policy makers and IT managers within organizations to use IT resources in cost-effective and energy-efficient ways. Investigating the factors that influence decision-makers’ intention towards the adoption of Green IT is important in developing strategies that encourage organizations to use Green IT. Therefore, the objective of this study is to understand the potential factors that drive decision-makers in the Malaysian manufacturing sector to adopt Green IT. This research accordingly developed a model by integrating two theoretical models, the Theory of Planned Behavior and Norm Activation Theory, to explore individual factors that influence decision-makers in the manufacturing sector in Malaysia to adopt Green IT via the mediation of personal norms. To determine predictive factors that influence managerial intention toward Green IT adoption, the researchers conducted a comprehensive literature review. Data were collected from 183 decision-makers in the Malaysian manufacturing sector and analyzed using Structural Equation Modelling. This research provides important preliminary insights into the most significant factors that determine managerial intention towards Green IT adoption. The model of Green IT adoption explains the factors that encourage individual decision-makers in Malaysian organizations to adopt Green IT initiatives for environmental sustainability.

41 citations

Journal ArticleDOI
TL;DR: The experimental results demonstrate that an optimized SVM-PSO algorithm can effectively forecast the future price of cryptocurrency and outperforms single SVM algorithms.

39 citations


Cited by
Journal ArticleDOI
01 Jan 2015-Methods
TL;DR: This review focuses on commonly used fingerprint algorithms, their usage in virtual screening, and the software packages and online tools that provide these algorithms.

491 citations

Journal Article
TL;DR: In this article, the authors analyzed the electronic structure and optical properties of perovskite solar cells based on CH3NH3PbI3 with the quasiparticle self-consistent GW approximation.
Abstract: The performance of organometallic perovskite solar cells has rapidly surpassed that of both traditional dye-sensitized and organic photovoltaics, e.g. solar cells based on CH3NH3PbI3 have recently reached 18% conversion efficiency. We analyze its electronic structure and optical properties within the quasiparticle self-consistent GW approximation (QSGW). Quasiparticle self-consistency is essential for an accurate description of the band structure: bandgaps are much larger than what is predicted by the local density approximation (LDA) or GW based on the LDA. Several characteristics combine to make the electronic structure of this material unusual. First, there is a strong driving force for ferroelectricity, as a consequence of the polar organic moiety CH3NH3. The moiety is only weakly coupled to the PbI3 cage; thus it can rotate and give rise to ferroelectric domains. This in turn will result in internal junctions that may aid separation of photoexcited electron-hole pairs, and may contribute to the current-voltage hysteresis found in perovskite solar cells. Second, spin-orbit coupling modifies both valence band and conduction band dispersions in a very unusual manner: both get split at the R point into two extrema nearby. This can be interpreted in terms of a large Dresselhaus term, which vanishes at R but for small excursions about R varies linearly in k. Conduction bands (Pb 6p character) and valence bands (I 5p) are affected differently; moreover the splittings vary with the orientation of the moiety. We will show how the splittings, and their dependence on the orientation of the moiety through the ferroelectric effect, have important consequences for both electronic transport and the optical properties of this material.

418 citations

Journal ArticleDOI
TL;DR: A comprehensive comparison between XGBoost, LightGBM, CatBoost, random forests and gradient boosting has been performed and indicates that CatBoost obtains the best results in generalization accuracy and AUC in the studied datasets although the differences are small.
Abstract: The family of gradient boosting algorithms has recently been extended with several interesting proposals (i.e. XGBoost, LightGBM and CatBoost) that focus on both speed and accuracy. XGBoost is a scalable ensemble technique that has proven to be a reliable and efficient machine learning challenge solver. LightGBM is an accurate model focused on providing extremely fast training performance using selective sampling of high gradient instances. CatBoost modifies the computation of gradients to avoid the prediction shift in order to improve the accuracy of the model. This work proposes a practical analysis of how these novel variants of gradient boosting work in terms of training speed, generalization performance and hyper-parameter setup. In addition, a comprehensive comparison between XGBoost, LightGBM, CatBoost, random forests and gradient boosting has been performed using carefully tuned models as well as their default settings. The results of this comparison indicate that CatBoost obtains the best results in generalization accuracy and AUC in the studied datasets, although the differences are small. LightGBM is the fastest of all methods but not the most accurate. XGBoost places second both in accuracy and in training speed. Finally, an extensive analysis of the effect of hyper-parameter tuning in XGBoost, LightGBM and CatBoost is carried out using two novel proposed tools.

375 citations