Showing papers on "Random forest" published in 2023

Journal Article•DOI•

Meta‐learning how to forecast time series

Thiyanga S. Talagala, Rob J. Hyndman, George Athanasopoulos

09 Feb 2023-Journal of Forecasting

TL;DR: A random forest is used to identify the best forecasting method using only time series features and is shown to yield accurate forecasts comparable to several benchmarks and other commonly used automated approaches of time series forecasting.

Abstract: A crucial task in time series forecasting is the identification of the most suitable forecasting method. We present a general framework for forecast-model selection using meta-learning. A random forest is used to identify the best forecasting method using only time series features. The framework is evaluated using time series from the M1 and M3 competitions and is shown to yield accurate forecasts comparable to several benchmarks and other commonly used automated approaches of time series forecasting. A key advantage of our proposed framework is that the time-consuming process of building a classifier is handled in advance of the forecasting task at hand.

47 citations

Journal Article•DOI•

Assessment of tunnel blasting-induced overbreak: A novel metaheuristic-based random forest approach

Biao He, Danial Jahed Armaghani, S. Lami

01 Mar 2023-Tunnelling and Underground Space Technology

TL;DR: In this paper, three hybrid random forest-based models (RF-GWO, RF-WOA, and RF-TSA) were constructed to predict overbreak in highway tunnels.

18 citations

Journal Article•DOI•

Performance analysis of the water quality index model for predicting water state using machine learning techniques

Md. Galal Uddin¹

Jagannath University¹

01 Jan 2023-Chemical engineering research & design

TL;DR: In this article, four machine learning classifier algorithms, including support vector machines (SVM), Naïve Bayes (NB), random forest (RF), and gradient boosting (XGBoost), were used to identify the best classifier for predicting water quality classes using seven widely used WQI models, whereas three models are completely new and recently proposed by the authors.

Abstract: Existing water quality index (WQI) models assess water quality using a range of classification schemes. Consequently, different methods provide a number of interpretations for the same water properties, which contributes to considerable uncertainty in the correct classification of water quality. The aim of this study was to evaluate the performance of the WQI model in classifying coastal water quality correctly using a completely new classification scheme. Cork Harbour water quality data, collected by Ireland's Environmental Protection Agency (EPA), was used in this study. In the present study, four machine-learning classifier algorithms, including support vector machines (SVM), Naïve Bayes (NB), random forest (RF), k-nearest neighbour (KNN), and gradient boosting (XGBoost), were utilized to identify the best classifier for predicting water quality classes using seven widely used WQI models, whereas three models are completely new and recently proposed by the authors. The KNN (100% correct and 0% wrong) and XGBoost (99.9% correct and 0.1% wrong) algorithms performed best in predicting the water quality accurately for the seven WQI models. The model validation results indicate that the XGBoost classifier outperformed the others in terms of accuracy (1.0), precision (0.99), sensitivity (0.99), specificity (1.0), and F1 score (0.99) in predicting the correct classification of water quality. Moreover, compared to the other WQI models, higher prediction accuracy, precision, sensitivity, specificity, and F1 score were found for the weighted quadratic mean (WQM) and unweighted root mean square (RMS) WQI models, respectively, for each class. The findings of this study showed that the WQM and RMS models could be effective and reliable for assessing coastal water quality in terms of correct classification. Therefore, this study could be helpful in providing accurate water quality information to researchers, policymakers, and water research personnel for monitoring using the WQI model more effectively.
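
The five validation metrics named in this abstract (accuracy, precision, sensitivity, specificity, and F1 score) all derive from a confusion matrix. The sketch below is a minimal, standard-library illustration of those definitions for one class treated as positive against all others. It is not the authors' code: the helper names and the `good`/`fair`/`poor` labels are hypothetical and invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class BinaryConfusion:
    """Counts for one class treated as 'positive' versus all other classes."""
    tp: int  # true positives
    fp: int  # false positives
    fn: int  # false negatives
    tn: int  # true negatives


def confusion_from_labels(y_true, y_pred, positive):
    """Tally one-vs-rest confusion counts for the chosen positive class."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return BinaryConfusion(tp, fp, fn, tn)


def classification_metrics(c):
    """Return accuracy, precision, sensitivity, specificity and F1 as a dict."""
    def ratio(num, den):
        # Guard against empty denominators (e.g. a class that is never predicted).
        return num / den if den else 0.0

    precision = ratio(c.tp, c.tp + c.fp)
    sensitivity = ratio(c.tp, c.tp + c.fn)  # also called recall
    return {
        "accuracy": ratio(c.tp + c.tn, c.tp + c.fp + c.fn + c.tn),
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": ratio(c.tn, c.tn + c.fp),
        "f1": ratio(2 * precision * sensitivity, precision + sensitivity),
    }


# Hypothetical water quality classes, invented for illustration only.
y_true = ["good", "good", "fair", "poor", "good", "fair", "poor", "good"]
y_pred = ["good", "fair", "fair", "poor", "good", "good", "poor", "good"]
metrics = classification_metrics(confusion_from_labels(y_true, y_pred, "good"))
```

For a multi-class problem such as this one, the same counts would be tallied once per class, which is presumably what "for each class" refers to in the abstract.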

16 citations

Journal Article•DOI•

Landslide Susceptibility mapping using random forest and extreme gradient boosting: A case study of Fengjie, Chongqing

Wengang Zhang, Yuwei He, Luqi Wang, Songlin Liu, Xuanyu Meng

07 Feb 2023-Geological Journal

TL;DR: This paper evaluates random forest (RF) and extreme gradient boosting (XGBoost) classifier models for landslide susceptibility and compares their applicability in Fengjie County, Chongqing, a typical landslide-prone area in southwest China.

Abstract: Landslide susceptibility analysis can provide theoretical support for landslide risk management. However, some susceptibility analyses are not sufficiently interpretable. Moreover, the accuracy of many research methods needs to be improved. This study aims to address these deficiencies by evaluating random forest (RF) and extreme gradient boosting (XGBoost) classifier models for landslide susceptibility and comparing their applicability in Fengjie County, Chongqing, a typical landslide‐prone area in southwest China. Firstly, information on 1624 landslides from 1980 to 2020 was obtained through field investigation, and a geospatial database of 16 conditional factors was constructed. Secondly, non‐landslide points were selected to form a complete data set, and RF and XGBoost models were established. Finally, the area under the ROC curve (AUC) value, accuracy, and F‐score were used to compare the two models. The results show that even though both classifiers evaluate landslide susceptibility with high accuracy, the RF model performs better: it has a higher AUC value of 0.866, and its accuracy and F‐score are approximately 2% higher than those of XGBoost. The land use, elevation, and lithology of Fengjie County contribute to the occurrence of landslides. This is due to human engineering activities (such as land reclamation and housing construction) resulting in low slope stability and landslides in widely distributed sandstone, siltstone, and mudstone layers owing to their low permeability and planes of weakness.

15 citations

Journal Article•DOI•

Modified Genetic Algorithm with Deep Learning for Fraud Transactions of Ethereum Smart Contract

Pavan Kumar¹

Barkatullah University¹

04 Jan 2023-Applied Sciences

TL;DR: In this article, an Optimized Genetic Algorithm-Cuckoo Search (GA-CS) is combined with deep learning for detecting fraudulent transactions on Ethereum smart contracts, using a unique metaheuristic optimization strategy.

Abstract: Recently, Ethereum smart contracts have seen a surge in interest from the scientific community and new commercial uses. However, as online trade expands, fraudulent practices, including phishing, bribery, and money laundering, emerge as significant challenges to trade security. To detect fraudulent transactions reliably, this work developed a deep learning model using a unique metaheuristic optimization strategy. The new optimization method, Optimized Genetic Algorithm-Cuckoo Search (GA-CS), is combined with deep learning. In this research, a Genetic Algorithm (GA) is used in the exploration phase of the Cuckoo Search (CS) technique to address a deficiency in CS. A comprehensive experiment was conducted to appraise the efficiency and performance of the suggested strategies compared with those of various popular techniques, such as k-nearest neighbors (KNN), logistic regression (LR), multi-layer perceptron (MLP), XGBoost, light gradient boosting machine (LGBM), random forest (RF), and support vector classification (SVC), in terms of restricted features, and their performance and efficiency metrics were compared with those of the suggested approach in detecting fraudulent behavior on Ethereum. The suggested technique and SVC models outperform the rest of the models, with the highest accuracy, while deep learning with the proposed optimization strategy outperforms the RF model, with slightly higher performance of 99.71% versus 98.33%.

15 citations

Journal Article•DOI•

Schizophrenia classification using machine learning on resting state EEG signal

Juan Ruiz de Miras, Antonio J. Ibáñez-Molina, María Felipa Soriano, Sergio Iglesias-Parro

01 Jan 2023-Biomedical Signal Processing and Control

TL;DR: In this paper, the authors proposed a processing pipeline to obtain machine learning classifiers of schizophrenia based on resting state EEG data, and evaluated whether machine learning techniques can help in the diagnosis of schizophrenia.

14 citations

Journal Article•DOI•

Limitations of receiver operating characteristic curve on imbalanced data: Assist device mortality risk scores

01 Apr 2023-The Journal of Thoracic and Cardiovascular Surgery

TL;DR: In this article, the authors compared the receiver operating characteristic and precision recall curve for two classifiers for 90-day left ventricular assist device mortality, HeartMate Risk Score and Random Forest, for 800 patients (test group) recorded in the Interagency Registry for Mechanically Assisted Circulatory Support who received a continuous-flow left ventricular assist device between 2006 and 2016 (mean age, 59 years; 146 female vs 654 male patients).
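
The contrast between the two curves can be illustrated with a small standard-library sketch. It computes the area under the ROC curve (as the probability that a positive case is ranked above a negative one) and the average precision (a common summary of the precision-recall curve) on an invented sample with 5 positives among 100 cases. The scores are synthetic and chosen only to show the effect; they have no connection to the registry data or the classifiers in the study, and the function names are hypothetical.

```python
def roc_auc(y_true, scores):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(y_true, scores):
    """Mean of the precision values measured at the rank of each positive case."""
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits


# Synthetic, imbalanced sample: 95 negatives and 5 positives (invented values).
neg_scores = [i / 100 for i in range(95)]         # 0.00 .. 0.94
pos_scores = [0.97, 0.905, 0.855, 0.805, 0.705]   # mostly high, but not top-ranked
y_true = [0] * len(neg_scores) + [1] * len(pos_scores)
scores = neg_scores + pos_scores

auc = roc_auc(y_true, scores)
ap = average_precision(y_true, scores)
```

On this invented sample the ROC AUC is about 0.89 while the average precision is about 0.40: a handful of negatives ranked above the positives barely moves the ROC curve when negatives are plentiful, but it cuts precision sharply.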

13 citations

Journal Article•DOI•

A Clinical-Oriented Non-Severe Depression Diagnosis Method Based on Cognitive Behavior of Emotional Conflict

01 Feb 2023-IEEE Transactions on Computational Social Systems

TL;DR: This paper proposes a diagnosis method for non-severe depression based on the cognitive behavior of emotional conflict; four classifiers (k-nearest neighbor (KNN), support vector machine (SVM), kernel extreme learning machine (KELM), and random forest (RF)) were used to classify patients and normal subjects.

Abstract: To improve the diagnosis accuracy of non-severe depression (NSD), this article proposes a diagnosis method of NSD based on cognitive behavior of emotional conflict. First, the original classification features are constructed based on the cognitive behavior of emotional conflict and statistical distribution, and a classification normalization method is proposed to preprocess the feature data. Then, the relief algorithm and principal component analysis (PCA) are recruited for feature processing. Finally, four classifiers [$k$-nearest neighbor (KNN), support vector machine (SVM), kernel extreme learning machine (KELM), and random forest (RF)] are used to classify NSD patients and normal subjects. The test results show that among all the classifiers, RF achieves the highest classification sensitivity and specificity of 92% and 88%, respectively. Compared with the results of other NSD diagnosis methods in recent years, it has a better performance. The diagnostic method for NSD proposed in this article has obvious performance advantages and provides technical support for improving the accuracy of clinical depression diagnosis. Furthermore, it also provides a new idea and method for the diagnosis and screening of depression.

11 citations

Journal Article•DOI•

Intelligent transportation system for internet of vehicles based vehicular networks for smart cities

P Yavana Rani, Rohit Sharma

01 Jan 2023-Computers & Electrical Engineering

TL;DR: In this paper, an intelligent transport system for IoV-based vehicular network traffic in a smart city scenario is proposed, built on tree-based machine learning (ML) models: Decision Tree (DT), Random Forest (RF), Extra Tree (ET), and XGBoost.

10 citations

Journal Article•DOI•

Survey of feature selection and extraction techniques for stock market prediction

Htet Htet Htun, M. Biehl, Nicolai Petkov

12 Jan 2023-Financial Innovation

TL;DR: In this paper, the authors present a detailed analysis of 32 research works that use a combination of feature study and ML approaches in various stock market applications and find that correlation criteria, random forest, principal component analysis, and autoencoder are the most widely used feature selection and extraction techniques with the best prediction accuracy for various stock market applications.

Abstract: In stock market forecasting, the identification of critical features that affect the performance of machine learning (ML) models is crucial to achieve accurate stock price predictions. Several review papers in the literature have focused on various ML, statistical, and deep learning-based methods used in stock market forecasting. However, no survey study has explored feature selection and extraction techniques for stock market forecasting. This survey presents a detailed analysis of 32 research works that use a combination of feature study and ML approaches in various stock market applications. We conduct a systematic search for articles in the Scopus and Web of Science databases for the years 2011-2022. We review a variety of feature selection and feature extraction approaches that have been successfully applied in the stock market analyses presented in the articles. We also describe the combination of feature analysis techniques and ML methods and evaluate their performance. Moreover, we present other survey articles, stock market input and output data, and analyses based on various factors. We find that correlation criteria, random forest, principal component analysis, and autoencoder are the most widely used feature selection and extraction techniques with the best prediction accuracy for various stock market applications.

10 citations

Journal Article•DOI•

Prediction of operative mortality for patients undergoing cardiac surgical procedures without established risk scores

01 Apr 2023-The Journal of Thoracic and Cardiovascular Surgery

TL;DR: In this paper, four supervised machine learning models were trained using preoperative variables present in the Society of Thoracic Surgeons (STS) data set of the Massachusetts General Hospital to predict and classify operative mortality in procedures without STS risk scores.

Journal Article•DOI•

Bitter-RF: A random forest machine model for recognizing bitter peptides

Yu Fei Zhang, Yu-Hao Wang, Zhimei Gu, Xianrun Pan, Jian Li, Hui Ding, Yan Zhang, Ke-Jun Deng

26 Jan 2023-Frontiers in Medicine

TL;DR: In this paper, a Random Forest (RF)-based model, called Bitter-RF, was developed for identifying bitter peptides; the RF method had not previously been used to build a prediction model for bitter peptides.

Abstract: Introduction: Bitter peptides are short peptides with potential medical applications. The huge potential behind its bitter taste remains to be tapped. To better explore the value of bitter peptides in practice, we need a more effective classification method for identifying bitter peptides. Methods: In this study, we developed a Random forest (RF)-based model, called Bitter-RF, using sequence information of the bitter peptide. Bitter-RF covers more comprehensive and extensive information by integrating 10 features extracted from the bitter peptides and achieves better results than the latest generation model on independent validation set. Results: The proposed model can improve the accurate classification of bitter peptides (AUROC = 0.98 on independent set test) and enrich the practical application of RF method in protein classification tasks which has not been used to build a prediction model for bitter peptides. Discussion: We hope the Bitter-RF could provide more conveniences to scholars for bitter peptide research.

Journal Article•DOI•

Automatic COVID-19 prediction using explainable machine learning techniques

Sanzida Solayman, Sk. Azmiara Aumi, Chand Sultana Mery, Muktadir Mubassir, Riasat Khan

01 Jan 2023-International journal of cognitive computing in engineering

TL;DR: In this paper, the authors analyzed automatic COVID-19 detection using machine learning techniques to build an intelligent web application, and used explainable AI with the LIME framework to interpret the prediction results.

Abstract: The coronavirus is considered this century's most disruptive catastrophe and global concern. This disease has prompted extreme social, psychological and economic impacts affecting millions of people around the globe. COVID-19 is transmitted from one infected person's body to another through respiratory droplets. This virus proliferates when people breathe in air-contaminated space with droplets and microscopic airborne particles. This research aims to analyze automatic COVID-19 detection using machine learning techniques to build an intelligent web application. The dataset has been preprocessed by dropping null values, feature engineering, and synthetic oversampling (SMOTE) techniques. Next, we trained and evaluated different classifiers, i.e., logistic regression, random forest, decision tree, k-nearest neighbor, support vector machine (SVM), ensemble models (adaptive boosting and extreme gradient boosting) and deep learning (artificial neural network, convolutional neural network and long short-term memory) techniques. Explainable AI with the LIME framework has been applied to interpret the prediction results. The hybrid CNN-LSTM algorithm with the SMOTE approach performed better than the other models on the employed open-source dataset obtained from the Israeli Ministry of Health website, with 96.34% accuracy and a 0.98 F1 score. Finally, this model was chosen to deploy the proposed prediction system to a website, where users may acquire an instantaneous COVID-19 prognosis based on their symptoms.

Journal Article•DOI•

A novel ensemble-based statistical approach to estimate daily wildfire-specific PM2.5 in California (2006–2020)

Olesya Slyzhuk¹

Scripps Institution of Oceanography¹

01 Jan 2023-Environment International

TL;DR: This paper presents an ensemble model that optimally combines multiple machine learning algorithms (including gradient boosting machine, random forest and deep learning) and a large set of explanatory variables to estimate daily PM2.5 concentrations at the ZIP code level, a relevant spatiotemporal resolution for epidemiological studies.

Journal Article•DOI•

PMPTCE-HNEA: Predicting metabolic pathway types of chemicals and enzymes with a heterogeneous network embedding algorithm

Lei Chen, Hao Wang

24 Feb 2023-Current Bioinformatics

TL;DR: In this paper, a multi-label classifier is proposed to identify metabolic pathway types, reported in KEGG, of compounds and enzymes, and three heterogeneous networks are constructed.

Abstract: Metabolic chemical reaction is one type of fundamental process to maintain life. Generally, each reaction needs an enzyme. The metabolic pathway collects a series of chemical reactions at the system level. As compounds and enzymes were two important components in each metabolic pathway, the identification of metabolic pathways in which a given compound or enzyme can participate is the first important step for understanding the mechanism of metabolic pathways. The purpose of this study was to build efficient computational methods to predict the metabolic pathways of compounds and enzymes. Novel multi-label classifiers are proposed to identify metabolic pathway types, reported in KEGG, of compounds and enzymes. Three heterogeneous networks defining compounds and enzymes as nodes are constructed. To extract more informative features of compounds and enzymes, we generalize the powerful network embedding algorithm, Mashup, to its heterogeneous network version, named MashupH. RAndom k-labELsets (RAKEL) is employed to build the classifiers and a support vector machine or random forest is selected as the base classification algorithm. The 10-fold cross-validation results indicate the good performance of the proposed classifiers and such performance is superior to the previous classifier that adopted features yielded by Mashup. They were also better than the classifiers using the traditional features of compounds and enzymes. Furthermore, some key parameters of MashupH that may contribute to or influence the classifiers are analyzed. The features yielded by MashupH are more informative than those produced by Mashup on heterogeneous networks. This is the main reason why the new classifiers are superior to those using features yielded by Mashup.
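
For readers unfamiliar with the evaluation protocol mentioned in this abstract, the sketch below shows how 10-fold cross-validation partitions a data set so that every sample is held out exactly once. It uses only the standard library; the function name `k_fold_indices`, the seed, and the sample count of 25 are hypothetical choices for illustration and are not taken from the paper.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs; each sample is tested exactly once."""
    indices = list(range(n_samples))
    # Shuffle with a fixed seed so the partition is reproducible.
    random.Random(seed).shuffle(indices)
    # Deal samples round-robin so fold sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i, test_idx in enumerate(folds):
        train_idx = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train_idx, test_idx


# Hypothetical data set of 25 samples split into 10 folds.
splits = list(k_fold_indices(n_samples=25, k=10))
```

A classifier would be fitted on each `train_idx` and scored on the matching `test_idx`, with the ten scores averaged to give the cross-validated estimate.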

Journal Article•DOI•

Forecasting Bitcoin with technical analysis: A not-so-random forest?

01 Jan 2023-International Journal of Forecasting

TL;DR: In this article, the authors used data sampled at hourly and daily frequencies to predict Bitcoin returns and found that Bitcoin prices are weakly efficient at the hourly frequency, while technical analysis combined with nonlinear forecasting models becomes statistically significantly dominant relative to the random walk model on a daily horizon.

Journal Article•DOI•

House Prices Advanced Regression Techniques

Gadde Vinay Venkata Abhinav Kumar, Kanneganti Subba Rayudu, Gutta Ajay Kumar, D. Satish

28 Feb 2023-International Journal For Science Technology And Engineering

TL;DR: In this article, the most useful models and important criteria for predicting home values are examined in a literature review, and the adoption of Random Forest and XGBoost as the most effective models in comparison to others was confirmed by this study's findings.

Abstract: The real estate industry is seeing an increase in the use of data mining. The capacity of data mining to extricate helpful data from crude information makes it especially helpful for anticipating home estimations, essential housing characteristics, and a great many different elements. Homeowners and the real estate industry frequently feel anxious about price swings, according to research. The most useful models and important criteria for predicting home values are examined in a literature review. The adoption of Random Forest and XGBoost as the most effective models in comparison to others was confirmed by this study's findings. Additionally, our data suggest that locational and structural characteristics are significant forecasting variables for housing values. In order to identify the most effective machine learning model for conducting a study in this field and the most significant factors that influence home prices, this study will be very helpful, particularly to housing developers and academics.

Journal Article•DOI•

Towards an Interpretable Autoencoder: A Decision-Tree-Based Autoencoder and its Application in Anomaly Detection

01 Mar 2023-IEEE Transactions on Dependable and Secure Computing

TL;DR: In this paper, the authors proposed the first interpretable autoencoder based on decision trees, which is designed to handle categorical data without the need to transform the data representation.

Abstract: The importance of understanding and explaining the associated classification results in the utilization of artificial intelligence (AI) in many different practical applications (e.g., cyber security and forensics) has contributed to the trend of moving away from black-box / opaque AI towards explainable AI (XAI). In this article, we propose the first interpretable autoencoder based on decision trees, which is designed to handle categorical data without the need to transform the data representation. Furthermore, our proposed interpretable autoencoder provides a natural explanation for experts in the application area. The experimental findings show that our proposed interpretable autoencoder is among the top-ranked anomaly detection algorithms, along with one-class Support Vector Machine (SVM) and Gaussian Mixture. More specifically, our proposal is on average 2% below the best Area Under the Curve (AUC) result and 3% over the other Average Precision scores, in comparison to One-class SVM, Isolation Forest, Local Outlier Factor, Elliptic Envelope, Gaussian Mixture Model, and eForest.

Journal Article•DOI•

Development of machine learning methods to predict the compressive strength of fiber-reinforced self-compacting concrete and sensitivity analysis

Hai-Van Thi Mai, M. H. Nguyen, Hai-Bang Ly

01 Feb 2023-Construction and Building Materials

TL;DR: In this paper, three machine learning models, including decision tree, light gradient boosting machine, and extreme gradient boosting (XGBoost), were evaluated for predicting the compressive strength of fiber-reinforced self-compacting concrete (FRSCC).

Journal Article•DOI•

A hybrid super ensemble learning model for the early-stage prediction of diabetes risk

Ayşenur Doğru, Selim Buyrukoglu, Murat Arı

05 Jan 2023-Medical & Biological Engineering & Computing

Journal Article•DOI•

Predicting Water Quality Index (WQI) by feature selection and machine learning: A case study of An Kim Hai irrigation system

Bui Quoc Lap, Thi Anh Thu Phan, Huu D. Nguyen, Lê Xuân Quang, Phi Thi Hang, Nguyen Quang Phi, V. Truong Hoang, Pham Gia Linh, Bui Thanh Hang

01 May 2023-Ecological Informatics

TL;DR: In this paper, the authors investigated the performance of the ML-based method in calculating the WQI and applied several feature selection techniques to select the key parameters fed to the ML models.

Journal Article•DOI•

An Intrusion Detection System for SDN Using Machine Learning

01 Jan 2023

TL;DR: In this article, a hybrid feature selection algorithm consisting of two phases is applied to reduce the data dimension and to obtain an optimal feature subset, which is then used to detect and classify different types of attacks.

Abstract: Software Defined Networking (SDN) has emerged as a promising and exciting option for the future growth of the internet. SDN has increased the flexibility and transparency of the managed, centralized, and controlled network. On the other hand, these advantages create a more vulnerable environment with substantial risks, culminating in network difficulties, system paralysis, online banking frauds, and robberies. These issues have a significant detrimental impact on organizations, enterprises, and even economies. Accuracy, high performance, and real-time systems are necessary to achieve this goal. Using a SDN to extend intelligent machine learning methodologies in an Intrusion Detection System (IDS) has stimulated the interest of numerous research investigators over the last decade. In this paper, a novel HFS-LGBM IDS is proposed for SDN. First, the Hybrid Feature Selection algorithm consisting of two phases is applied to reduce the data dimension and to obtain an optimal feature subset. In the first phase, the Correlation based Feature Selection (CFS) algorithm is used to obtain the feature subset. The optimal feature set is obtained by applying the Random Forest Recursive Feature Elimination (RF-RFE) in the second phase. A LightGBM algorithm is then used to detect and classify different types of attacks. The experimental results based on NSL-KDD dataset show that the proposed system produces outstanding results compared to the existing methods in terms of accuracy, precision, recall and f-measure.

Journal Article•DOI•

Fault Prediction Recommender Model for IoT Enabled Sensors Based Workplace

Mudita Uppal, Deepali Gupta, Amena Mahmoud, M.A. Elmagzoub, Adel Sulaiman, Mana Saleh Al Reshan, Asadullah Shaikh

06 Jan 2023-Sustainability

TL;DR: In this paper, the authors proposed a fault prediction recommender model for real-time health monitoring of IoT devices through an ML algorithm to make devices more efficient and increase the quality of life.

Abstract: Industry 5.0 benefits from advancements being made in the field of machine learning and the Internet of Things. Different sensors have been installed in a variety of IoT devices present in different industries such as transportation, healthcare, manufacturing, agriculture, etc. The sensors present in these devices should automatically predict errors due to the extensive use of sensors in urban living. To ensure the integrity, precision, security, dependability and fidelity of sensor nodes, it is, therefore, necessary to foresee faults before they occur. Additionally, as more data is being collected by these devices every day, cloud computing becomes more necessary for sustainable urban living. The proposed model emphasizes solution recommendations for faults that occurred in real-life smart devices to mitigate faults at an early stage, which is a key requirement in today’s smart offices. The proposed model monitors the real-time health of IoT devices through an ML algorithm to make devices more efficient and increase the quality of life. Through the use of K-Nearest Neighbor, Decision Tree, Gaussian Naive Bayes and Random Forest approach, the proposed fault prediction recommender model has been evaluated and Random Forest shows the highest accuracy compared to other classifiers. Several performance indicators such as recall, accuracy, F1 score and precision were utilized to examine the performance of the model. The results have demonstrated the effectiveness of ML techniques applied to sensors in predicting faults in smart offices with Random Forest being observed as the best technique with a maximum accuracy of 94.27%. In future, deep learning can also be applied to bigger datasets to provide more accurate results.

Journal Article•DOI•

ML‐IDSDN: Machine learning based intrusion detection system for software‐defined network

Abdulsalam O. Alzahrani, Mohammed J. F. Alenazi

11 Nov 2022-Concurrency and Computation: Practice and Experience

TL;DR: In this paper, two datasets generated from SDN using Mininet and the Ryu controller with different feature extraction tools were used for training a number of supervised binary classification machine learning algorithms such as kNN, AdaBoost, decision tree (DT), random forest, naive Bayes, multilayer perceptron, support vector machine, and XGBoost.

Abstract: Software‐defined networking (SDN) has been developed to separate the network control plane from the forwarding plane, which can decrease operational costs and the time it takes to deploy new services compared to traditional networks. Despite these advantages, this technology brings threats and vulnerabilities. Consequently, developing high‐performance real‐time intrusion detection systems (IDSs) to classify malicious activities is a vital part of SDN architecture. This article introduces two datasets generated from SDN using Mininet and the Ryu controller with different feature extraction tools. The datasets contain normal traffic and different types of attacks (Fin flood, UDP flood, ICMP flood, OS probe scan, port probe scan, TCP bandwidth flood, and TCP syn flood) and are used for training a number of supervised binary classification machine learning algorithms such as k‐nearest neighbor, AdaBoost, decision tree (DT), random forest, naive Bayes, multilayer perceptron, support vector machine, and XGBoost. The DT algorithm achieved scores high enough to fit a real‐time application: an F1 score on the attack class of 0.9995, an F1 score on the normal class of 0.9983, and a throughput of 6,737,147.275 samples per second with a total of three features. In addition, data preprocessing is used to reduce model complexity, thereby increasing the overall throughput to fit a real‐time system.
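
The per-class F1 scores reported above can be illustrated with a minimal Python sketch. The helper name `f1_for_class` and the toy labels are assumptions made for illustration; they are not taken from the paper or its datasets.

```python
def f1_for_class(y_true, y_pred, positive):
    """F1 score treating `positive` as the class of interest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # F1 is the harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy labels, not data from the paper.
y_true = ["attack", "attack", "attack", "normal", "normal", "attack"]
y_pred = ["attack", "attack", "normal", "normal", "attack", "attack"]

f1_attack = f1_for_class(y_true, y_pred, "attack")
f1_normal = f1_for_class(y_true, y_pred, "normal")
print(f1_attack, f1_normal)
```

Scoring each class separately, as the abstract does, shows whether a binary detector is weaker on attack traffic or on normal traffic, which a single accuracy figure can hide.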

Journal Article•DOI•

Application of machine learning approaches for land cover monitoring in northern Cameroon

[...]

Yisa Ginath Yuh, Wiktor Tracz, H. Damon Matthews, Sarah Turner

01 May 2023-Ecological Informatics

TL;DR: In this article, the authors compared the performance of four ML algorithms (kNN, SVM, ANN and RF) applied to LULC monitoring within the Mayo Rey department, North Province, Cameroon.

Journal Article•DOI•

Diagnosis of Schizophrenia: A Comprehensive Evaluation

[...]

01 Mar 2023-IEEE Journal of Biomedical and Health Informatics

TL;DR: In this article, the authors evaluated the performance of classification models along with different feature selection approaches on structural magnetic resonance imaging data and showed that proper selection of the features and the classification models can improve the diagnosis of Schizophrenia.

Abstract: Machine learning models have been successfully employed in the diagnosis of Schizophrenia. The impact of classification models and feature selection techniques on the diagnosis of Schizophrenia has not been evaluated. Here, we sought to assess the performance of classification models along with different feature selection approaches on structural magnetic resonance imaging data. The data consist of 72 subjects with Schizophrenia and 74 healthy control subjects. We evaluated different classification algorithms based on support vector machine (SVM), random forest, kernel ridge regression and randomized neural networks. Moreover, we evaluated T-Test, Receiver Operator Characteristics (ROC), Wilcoxon, entropy, Bhattacharyya, Minimum Redundancy Maximum Relevance (MRMR) and Neighbourhood Component Analysis (NCA) as the feature selection techniques. Based on the evaluation, SVM based models with Gaussian kernel proved better compared to other classification models and Wilcoxon feature selection emerged as the best feature selection approach. Moreover, in terms of data modality, the performance on the integration of grey matter and white matter proved better than the performance on grey and white matter individually. Our evaluation showed that classification algorithms along with the feature selection approaches impact the diagnosis of Schizophrenia. This indicates that proper selection of the features and the classification models can improve the diagnosis of Schizophrenia.

Journal Article•DOI•

Prediction of forest fire susceptibility applying machine and deep learning algorithms for conservation priorities of forest resources

[...]

Soumik Saha, Biswajit Bera, Pravat Kumar Shit, Suman Bhattacharjee, Nairita Sengupta

01 Jan 2023-Remote Sensing Applications: Society and Environment

TL;DR: In this paper, three machine learning models, i.e., Random Forest (RF), Multivariate Adaptive Regression Splines (MARS) and Deep Learning Neural Network (DLNN), have been used to demarcate accurate forest fire susceptibility zones.

Journal Article•DOI•

Real-Time State of Charge Estimation of Lithium-Ion Batteries Using Optimized Random Forest Regression Algorithm

[...]

01 Jan 2023-IEEE transactions on intelligent vehicles

TL;DR: In this paper, an improved machine learning approach for accurate and robust state of charge (SOC) estimation in electric vehicle (EV) batteries using a differential search optimized random forest regression (RFR) algorithm is presented.

Abstract: This paper presents an improved machine learning approach for accurate and robust state of charge (SOC) estimation in electric vehicle (EV) batteries using a differential search optimized random forest regression (RFR) algorithm. Precise SOC estimation supports the safety and reliability of the EV. Nevertheless, SOC is influenced by numerous factors which cannot be measured directly. RFR is suitable for real-time SOC estimation due to its robustness to noise and overfitting and its capacity to work with huge datasets. However, proper selection of the RFR architecture and hyper-parameter combination remains a key issue to be explored. Hence, a differential search algorithm (DSA) is employed to search for the optimal values of trees and leaves in the RFR algorithm. DSA optimized RFR eliminates the need for a filter in the data pre-processing steps and does not require detailed knowledge of battery chemistry; it only needs sensors to monitor battery voltage and current. The developed approach is validated at room temperature using two types of lithium-ion batteries under a pulse discharge test. In addition, the proposed model is verified under varying temperature settings under EV drive cycles. The experimental results demonstrate that the DSA optimized RFR algorithm achieves an RMSE of 0.382% in the HPPC test using a LiNMC battery. Besides, the proposed method obtains satisfactory outcomes in EV drive cycles, with MAE of 0.193% and 0.346% in DST and FUDS cycles, respectively, at 25°C.
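
The two error metrics quoted above, RMSE and MAE, can be sketched in a few lines of Python. The function names and the SOC values below are hypothetical and chosen only to illustrate the arithmetic; they are not results from the paper.

```python
import math


def rmse(actual, estimated):
    """Root mean squared error between two equal-length sequences."""
    squared = [(a - e) ** 2 for a, e in zip(actual, estimated)]
    return math.sqrt(sum(squared) / len(squared))


def mae(actual, estimated):
    """Mean absolute error between two equal-length sequences."""
    absolute = [abs(a - e) for a, e in zip(actual, estimated)]
    return sum(absolute) / len(absolute)


# Hypothetical SOC values in percent, not measurements from the paper.
soc_true = [100.0, 80.0, 60.0, 40.0]
soc_est = [99.0, 81.0, 60.0, 42.0]

soc_rmse = rmse(soc_true, soc_est)
soc_mae = mae(soc_true, soc_est)
print(soc_rmse, soc_mae)
```

RMSE squares each error before averaging, so it penalizes occasional large SOC errors more heavily than MAE does. That is one reason an abstract may report both.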

Journal Article•DOI•

Machine learning-based seismic fragility and seismic vulnerability assessment of reinforced concrete structures

[...]

Farzin Kazemi, Neda Asgarkhani, Robert Jankowski

01 Mar 2023-Soil Dynamics and Earthquake Engineering

TL;DR: This paper used Machine Learning (ML) algorithms with hyperparameter optimization methods, such as halving search, grid search, random search, and k-fold cross-validation, to derive the seismic fragility curve for accelerating seismic risk assessment.

Journal Article•DOI•

Predicting the carbon dioxide emission caused by road transport using a Random Forest (RF) model combined by Meta-Heuristic algorithms

[...]

Hamed Khajavi, Amir Rastgoo

01 Mar 2023-Sustainable Cities and Society

TL;DR: In this paper, hybrid Random Forest, support vector regression, and response surface methodology models are implemented to predict CO2 emission in 30 major cities in China; seven optimizers are applied to the random forest and two optimizers are applied to SVM to tune their hyper-parameters.
