Showing papers on "Random forest" published in 2023


Journal ArticleDOI
TL;DR: A random forest is used to identify the best forecasting method using only time series features and is shown to yield accurate forecasts comparable to several benchmarks and other commonly used automated approaches of time series forecasting.
Abstract: A crucial task in time series forecasting is the identification of the most suitable forecasting method. We present a general framework for forecast-model selection using meta-learning. A random forest is used to identify the best forecasting method using only time series features. The framework is evaluated using time series from the M1 and M3 competitions and is shown to yield accurate forecasts comparable to several benchmarks and other commonly used automated approaches of time series forecasting. A key advantage of our proposed framework is that the time-consuming process of building a classifier is handled in advance of the forecasting task at hand.

47 citations


Journal ArticleDOI
TL;DR: In this paper, three hybrid random forest-based models (RF-GWO, RF-WOA, and RF-TSA) were constructed to predict overbreak in highway tunnels.

18 citations


Journal ArticleDOI
TL;DR: In this article, four machine learning classifier algorithms, including support vector machines (SVM), Naïve Bayes (NB), random forest (RF), and gradient boosting (XGBoost), were utilized to identify the best classifier for predicting water quality classes using seven widely used WQI models, whereas three models are completely new and recently proposed by the authors.
Abstract: Existing water quality index (WQI) models assess water quality using a range of classification schemes. Consequently, different methods provide a number of interpretations for the same water properties, which contributes to considerable uncertainty in the correct classification of water quality. The aim of this study was to evaluate the performance of the WQI model in classifying coastal water quality correctly using a completely new classification scheme. Cork Harbour water quality data, collected by Ireland's Environmental Protection Agency (EPA), was used in this study. In the present study, four machine-learning classifier algorithms, including support vector machines (SVM), Naïve Bayes (NB), random forest (RF), k-nearest neighbour (KNN), and gradient boosting (XGBoost), were utilized to identify the best classifier for predicting water quality classes using seven widely used WQI models, whereas three models are completely new and recently proposed by the authors. The KNN (100% correct and 0% wrong) and XGBoost (99.9% correct and 0.1% wrong) algorithms outperformed the others in predicting the water quality accurately for the seven WQI models. The model validation results indicate that the XGBoost classifier performed best in predicting the correct classification of water quality, with accuracy (1.0), precision (0.99), sensitivity (0.99), specificity (1.0), and F1 score (0.99). Moreover, compared to WQI models, higher prediction accuracy, precision, sensitivity, specificity, and F1 score were found for the weighted quadratic mean (WQM) and unweighted root mean square (RMS) WQI models, respectively, for each class. The findings of this study showed that the WQM and RMS models could be effective and reliable for assessing coastal water quality in terms of correct classification. Therefore, this study could be helpful in providing accurate water quality information to researchers, policymakers, and water research personnel for monitoring using the WQI model more effectively.
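The validation metrics named in this abstract (accuracy, precision, sensitivity, specificity, and F1 score) can all be derived from the four cells of a confusion matrix. The following is a minimal sketch, assuming a one-vs-rest view of a single water quality class; the function name `binary_metrics` and the counts are hypothetical and are not taken from the study.

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute common classification metrics from confusion-matrix counts.

    tp/fp/tn/fn are the true-positive, false-positive, true-negative and
    false-negative counts for one class treated as "positive" (one-vs-rest).
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # also called recall
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }


# Hypothetical counts, for illustration only (not results from the paper).
scores = binary_metrics(tp=90, fp=10, tn=85, fn=15)
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

For a multi-class problem such as this one, the same function would be applied once per class and the per-class values reported or averaged; which averaging the authors used is not stated in the abstract.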

16 citations


Journal ArticleDOI
TL;DR: In this paper, the authors evaluate random forest (RF) and extreme gradient boosting (XGBoost) classifier models for landslide susceptibility and compare their applicability in Fengjie County, Chongqing, a typical landslide-prone area in southwest China.
Abstract: Landslide susceptibility analysis can provide theoretical support for landslide risk management. However, some susceptibility analyses are not sufficiently interpretable. Moreover, the accuracy of many research methods needs to be improved. This study addresses these deficiencies. It aims to research the evaluation effects of random forest (RF) and extreme gradient boosting (XGBoost) classifier models on landslide susceptibility, and to compare their applicability in Fengjie County, Chongqing, a typical landslide‐prone area in southwest China. Firstly, information on 1624 landslides from 1980 to 2020 was obtained through field investigation, and a geospatial database of 16 conditional factors was constructed. Secondly, non‐landslide points were selected to form a complete data set, and RF and XGBoost models were established. Finally, the area under the ROC curve (AUC) value, accuracy, and F‐score were used to compare the two models. The results show that even though both classifiers evaluate landslide susceptibility with high accuracy, the RF model performs better. In comparison, the RF model has a higher AUC value of 0.866, and its accuracy and F‐score are approximately 2% higher than those of XGBoost. The land use, elevation, and lithology of Fengjie County contribute to the occurrence of landslides. This is due to human engineering activities (such as land reclamation and housing construction) resulting in low slope stability, and to landslides in widely distributed sandstone, siltstone, and mudstone layers owing to their low permeability and planes of weakness.
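The AUC used here to compare the two models can be read as the probability that a randomly chosen landslide point receives a higher susceptibility score than a randomly chosen non-landslide point. The sketch below computes it directly from that definition; the function name `roc_auc`, the labels, and the scores are hypothetical illustrations, not data from the study.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Counts, over all (positive, negative) pairs, how often the positive
    sample has the higher score; ties count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = landslide, 0 = non-landslide) and model scores.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

The pairwise loop is quadratic in the number of samples, which is acceptable for a sketch; a rank-based formulation gives the same value more efficiently on large data sets.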

15 citations


Journal ArticleDOI
TL;DR: In this article, an Optimized Genetic Algorithm-Cuckoo Search (GA-CS) is combined with deep learning for detecting fraudulent transactions on Ethereum smart contracts, using a unique metaheuristic optimization strategy.
Abstract: Recently, Ethereum smart contracts have seen a surge in interest from the scientific community and in new commercial uses. However, as online trade expands, fraudulent practices, including phishing, bribery, and money laundering, emerge as significant challenges to trade security. To detect fraudulent transactions reliably, this work developed a deep learning model using a unique metaheuristic optimization strategy. The new optimization method, Optimized Genetic Algorithm-Cuckoo Search (GA-CS), is combined with deep learning to overcome these challenges. In this research, a Genetic Algorithm (GA) is used in the exploration phase of the Cuckoo Search (CS) technique to address a deficiency in CS. A comprehensive experiment was conducted to appraise the efficiency and performance of the suggested strategies compared with those of various popular techniques, such as k-nearest neighbors (KNN), logistic regression (LR), multi-layer perceptron (MLP), XGBoost, light gradient boosting machine (LGBM), random forest (RF), and support vector classification (SVC), in terms of restricted features. We compared their performance and efficiency metrics to the suggested approach in detecting fraudulent behavior on Ethereum. The suggested technique and SVC models outperform the rest of the models, with the highest accuracy, while deep learning with the proposed optimization strategy outperforms the RF model, with slightly higher performance of 99.71% versus 98.33%.

15 citations


Journal ArticleDOI
TL;DR: In this paper, the authors proposed a processing pipeline to obtain machine learning classifiers of schizophrenia based on resting state EEG data, and evaluated whether machine learning techniques can help in the diagnosis of schizophrenia.

14 citations


Journal ArticleDOI
TL;DR: In this article, the authors compared the receiver operating characteristic and precision-recall curves of two classifiers for 90-day left ventricular assist device mortality, HeartMate Risk Score and Random Forest, for 800 patients (test group) recorded in the Interagency Registry for Mechanically Assisted Circulatory Support who received a continuous-flow left ventricular assist device between 2006 and 2016 (mean age, 59 years; 146 female vs 654 male patients).

13 citations


Journal ArticleDOI
TL;DR: Wang et al. proposed a diagnosis method for non-severe depression based on cognitive behavior of emotional conflict; four classifiers (k-nearest neighbor (KNN), support vector machine (SVM), kernel extreme learning machine (KELM), and random forest (RF)) were used to classify patients and normal subjects.
Abstract: To improve the diagnosis accuracy of non-severe depression (NSD), this article proposes a diagnosis method of NSD based on cognitive behavior of emotional conflict. First, the original classification features are constructed based on the cognitive behavior of emotional conflict and statistical distribution, and a classification normalization method is proposed to preprocess the feature data. Then, the relief algorithm and principal component analysis (PCA) are used for feature processing. Finally, four classifiers [k-nearest neighbor (KNN), support vector machine (SVM), kernel extreme learning machine (KELM), and random forest (RF)] are used to classify NSD patients and normal subjects. The test results show that among all the classifiers, RF achieves the highest classification sensitivity and specificity of 92% and 88%, respectively. Compared with the results of other NSD diagnosis methods in recent years, it has a better performance. The diagnostic method for NSD proposed in this article has obvious performance advantages and provides technical support for improving the accuracy of clinical depression diagnosis. Furthermore, it also provides a new idea and method for the diagnosis and screening of depression.

11 citations


Journal ArticleDOI
TL;DR: In this paper, an intelligent transport system for IoV-based vehicular network traffic in a smart city scenario is proposed, based on tree-based machine learning (ML) models: Decision Tree (DT), Random Forest (RF), Extra Tree (ET), and XGBoost.

10 citations


Journal ArticleDOI
TL;DR: In this paper, the authors present a detailed analysis of 32 research works that use a combination of feature study and ML approaches in various stock market applications and find that correlation criteria, random forest, principal component analysis, and autoencoder are the most widely used feature selection and extraction techniques with the best prediction accuracy for various stock market applications.
Abstract: In stock market forecasting, the identification of critical features that affect the performance of machine learning (ML) models is crucial to achieve accurate stock price predictions. Several review papers in the literature have focused on various ML, statistical, and deep learning-based methods used in stock market forecasting. However, no survey study has explored feature selection and extraction techniques for stock market forecasting. This survey presents a detailed analysis of 32 research works that use a combination of feature study and ML approaches in various stock market applications. We conduct a systematic search for articles in the Scopus and Web of Science databases for the years 2011-2022. We review a variety of feature selection and feature extraction approaches that have been successfully applied in the stock market analyses presented in the articles. We also describe the combination of feature analysis techniques and ML methods and evaluate their performance. Moreover, we present other survey articles, stock market input and output data, and analyses based on various factors. We find that correlation criteria, random forest, principal component analysis, and autoencoder are the most widely used feature selection and extraction techniques with the best prediction accuracy for various stock market applications.

10 citations


Journal ArticleDOI
TL;DR: In this paper, four supervised machine learning models were trained using preoperative variables present in the Society of Thoracic Surgeons (STS) data set of the Massachusetts General Hospital to predict and classify operative mortality in procedures without STS risk scores.

Journal ArticleDOI
TL;DR: In this paper, a Random Forest (RF)-based model, called Bitter-RF, was developed for identifying bitter peptides. According to the authors, the RF method had not previously been used to build a prediction model for bitter peptides.
Abstract: Introduction: Bitter peptides are short peptides with potential medical applications. The huge potential behind their bitter taste remains to be tapped. To better explore the value of bitter peptides in practice, we need a more effective classification method for identifying bitter peptides. Methods: In this study, we developed a Random forest (RF)-based model, called Bitter-RF, using sequence information of the bitter peptide. Bitter-RF covers more comprehensive and extensive information by integrating 10 features extracted from the bitter peptides and achieves better results than the latest generation model on an independent validation set. Results: The proposed model can improve the accurate classification of bitter peptides (AUROC = 0.98 on the independent set test) and enrich the practical application of the RF method in protein classification tasks, which has not been used to build a prediction model for bitter peptides. Discussion: We hope that Bitter-RF can provide more convenience to scholars for bitter peptide research.

Journal ArticleDOI
TL;DR: In this paper, the authors analyzed automatic COVID-19 detection using machine learning techniques to build an intelligent web application, where they used explainable AI with the LIME framework to interpret the prediction results.
Abstract: The coronavirus is considered this century's most disruptive catastrophe and global concern. This disease has prompted extreme social, psychological and economic impacts affecting millions of people around the globe. COVID-19 is transmitted from one infected person to another through respiratory droplets. The virus spreads when people breathe in air contaminated with droplets and microscopic airborne particles. This research aims to analyze automatic COVID-19 detection using machine learning techniques to build an intelligent web application. The dataset has been preprocessed by dropping null values, feature engineering, and synthetic oversampling (SMOTE) techniques. Next, we trained and evaluated different classifiers, i.e., logistic regression, random forest, decision tree, k-nearest neighbor, support vector machine (SVM), ensemble models (adaptive boosting and extreme gradient boosting) and deep learning (artificial neural network, convolutional neural network and long short-term memory) techniques. Explainable AI with the LIME framework has been applied to interpret the prediction results. The hybrid CNN-LSTM algorithm with the SMOTE approach performed better than the other models on the employed open-source dataset obtained from the Israeli Ministry of Health website, with 96.34% accuracy and a 0.98 F1 score. Finally, this model was chosen to deploy the proposed prediction system to a website, where users may acquire an instantaneous COVID-19 prognosis based on their symptoms.
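The synthetic oversampling step mentioned in this abstract (SMOTE) creates new minority-class samples by interpolating between an existing minority sample and one of its nearest minority neighbours. The following is a simplified sketch of that idea, assuming numeric feature tuples; the function name `smote_sketch`, the parameter defaults, and the sample points are hypothetical, and the abstract does not state the configuration the authors used.

```python
import random


def smote_sketch(minority, n_new, k=3, seed=0):
    """Create n_new synthetic minority points by interpolating towards near neighbours.

    Simplification (assumption): the base point is drawn at random for each
    synthetic sample, rather than generating a fixed number per minority sample.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base_idx = rng.randrange(len(minority))
        base = minority[base_idx]
        # Rank the other minority samples by squared Euclidean distance to the base point.
        others = [p for i, p in enumerate(minority) if i != base_idx]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])  # one of the k nearest neighbours
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbour)))
    return synthetic


# Hypothetical two-feature minority samples, for illustration only.
minority = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.2), (1.2, 2.5)]
new_points = smote_sketch(minority, n_new=6)
print(len(new_points), "synthetic points generated")
```

Because each synthetic point lies on a segment between two real minority samples, oversampling should be applied to the training split only, so that no interpolated copy of a test sample leaks into training.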

Journal ArticleDOI
TL;DR: In this paper, an ensemble model optimally combines multiple machine learning algorithms (including gradient boosting machine, random forest and deep learning) and a large set of explanatory variables to estimate daily PM2.5 concentrations at the ZIP code level, a relevant spatiotemporal resolution for epidemiological studies.

Journal ArticleDOI
TL;DR: In this paper, a multi-label classifier is proposed to identify metabolic pathway types, reported in KEGG, of compounds and enzymes, and three heterogeneous networks are constructed.
Abstract: A metabolic chemical reaction is one type of fundamental process for maintaining life. Generally, each reaction needs an enzyme. The metabolic pathway collects a series of chemical reactions at the system level. As compounds and enzymes are two important components in each metabolic pathway, the identification of metabolic pathways in which a given compound or enzyme can participate is the first important step for understanding the mechanism of metabolic pathways. The purpose of this study was to build efficient computational methods to predict the metabolic pathways of compounds and enzymes. Novel multi-label classifiers are proposed to identify metabolic pathway types, reported in KEGG, of compounds and enzymes. Three heterogeneous networks defining compounds and enzymes as nodes are constructed. To extract more informative features of compounds and enzymes, we generalize the powerful network embedding algorithm, Mashup, to its heterogeneous network version, named MashupH. RAndom k-labELsets (RAKEL) is employed to build the classifiers and a support vector machine or random forest is selected as the base classification algorithm. The 10-fold cross-validation results indicate the good performance of the proposed classifiers and such performance is superior to the previous classifier that adopted features yielded by Mashup. They were also better than the classifiers using the traditional features of compounds and enzymes. Furthermore, some key parameters of MashupH that may contribute to or influence the classifiers are analyzed. The features yielded by MashupH are more informative than those produced by Mashup on heterogeneous networks. This is the main reason why the new classifiers are superior to those using features yielded by Mashup.
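The 10-fold cross-validation referred to in this abstract partitions the samples into ten folds and uses each fold once as the test set while training on the other nine. A minimal sketch of the index bookkeeping follows; the function name `k_fold_indices`, the seed, and the sample count are hypothetical, and the abstract does not say how the authors shuffled or stratified their data.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    if k < 2 or k > n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds whose sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        yield train_idx, test_idx


# Hypothetical data set of 25 samples split into 10 folds.
splits = list(k_fold_indices(25, k=10))
for fold_no, (train_idx, test_idx) in enumerate(splits, start=1):
    print(f"fold {fold_no}: {len(train_idx)} train, {len(test_idx)} test")
```

Every sample appears in exactly one test fold, so the averaged score uses each sample for evaluation once and never evaluates a model on data it was trained on.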

Journal ArticleDOI
TL;DR: In this article, the authors used data sampled at hourly and daily frequencies to predict Bitcoin returns and found that Bitcoin prices are weakly efficient at the hourly frequency and that technical analysis combined with nonlinear forecasting models becomes statistically significantly dominant relative to the random walk model on a daily horizon.

Journal ArticleDOI
TL;DR: In this article, the most useful models and important criteria for predicting home values are examined in a literature review, and the adoption of Random Forest and XGBoost as the most effective models in comparison to others was confirmed by this study's findings.
Abstract: The real estate industry is seeing an increase in the use of data mining. The capacity of data mining to extract useful information from raw data makes it especially helpful for predicting home values, essential housing characteristics, and many other factors. Homeowners and the real estate industry frequently feel anxious about price swings, according to research. The most useful models and important criteria for predicting home values are examined in a literature review. The adoption of Random Forest and XGBoost as the most effective models in comparison to others was confirmed by this study's findings. Additionally, our data suggest that locational and structural characteristics are significant forecasting variables for housing values. This study will be very helpful, particularly to housing developers and academics, in identifying the most effective machine learning model for research in this field and the most significant factors that influence home prices.

Journal ArticleDOI
TL;DR: In this paper, the authors proposed the first interpretable autoencoder based on decision trees, which is designed to handle categorical data without the need to transform the data representation.
Abstract: The importance of understanding and explaining the associated classification results in the utilization of artificial intelligence (AI) in many different practical applications (e.g., cyber security and forensics) has contributed to the trend of moving away from black-box / opaque AI towards explainable AI (XAI). In this article, we propose the first interpretable autoencoder based on decision trees, which is designed to handle categorical data without the need to transform the data representation. Furthermore, our proposed interpretable autoencoder provides a natural explanation for experts in the application area. The experimental findings show that our proposed interpretable autoencoder is among the top-ranked anomaly detection algorithms, along with one-class Support Vector Machine (SVM) and Gaussian Mixture. More specifically, our proposal is on average 2% below the best Area Under the Curve (AUC) result and 3% over the other Average Precision scores, in comparison to One-class SVM, Isolation Forest, Local Outlier Factor, Elliptic Envelope, Gaussian Mixture Model, and eForest.

Journal ArticleDOI
TL;DR: In this paper, three machine learning models, including decision tree, light gradient boosting machine, and extreme gradient boosting (XGBoost), were evaluated for predicting the compressive strength of fiber reinforced self-compact concrete (FRSCC).


Journal ArticleDOI
TL;DR: In this paper, the authors investigated the performance of the ML-based method in calculating the WQI and applied several feature selection techniques to select the key parameters fed to the ML models.

Journal ArticleDOI
01 Jan 2023
TL;DR: In this article, a hybrid feature selection algorithm consisting of two phases is applied to reduce the data dimension and to obtain an optimal feature subset, which is then used to detect and classify different types of attacks.
Abstract: Software Defined Networking (SDN) has emerged as a promising and exciting option for the future growth of the internet. SDN has increased the flexibility and transparency of the managed, centralized, and controlled network. On the other hand, these advantages create a more vulnerable environment with substantial risks, culminating in network difficulties, system paralysis, online banking frauds, and robberies. These issues have a significant detrimental impact on organizations, enterprises, and even economies. Accuracy, high performance, and real-time systems are necessary to achieve this goal. Using a SDN to extend intelligent machine learning methodologies in an Intrusion Detection System (IDS) has stimulated the interest of numerous research investigators over the last decade. In this paper, a novel HFS-LGBM IDS is proposed for SDN. First, the Hybrid Feature Selection algorithm consisting of two phases is applied to reduce the data dimension and to obtain an optimal feature subset. In the first phase, the Correlation based Feature Selection (CFS) algorithm is used to obtain the feature subset. The optimal feature set is obtained by applying the Random Forest Recursive Feature Elimination (RF-RFE) in the second phase. A LightGBM algorithm is then used to detect and classify different types of attacks. The experimental results based on NSL-KDD dataset show that the proposed system produces outstanding results compared to the existing methods in terms of accuracy, precision, recall and f-measure.

Journal ArticleDOI
TL;DR: In this paper, the authors proposed a fault prediction recommender model for real-time health monitoring of IoT devices through an ML algorithm to make devices more efficient and increase the quality of life.
Abstract: Industry 5.0 benefits from advancements being made in the field of machine learning and the Internet of Things. Different sensors have been installed in a variety of IoT devices present in different industries such as transportation, healthcare, manufacturing, agriculture, etc. The sensors present in these devices should automatically predict errors due to the extensive use of sensors in urban living. To ensure the integrity, precision, security, dependability and fidelity of sensor nodes, it is, therefore, necessary to foresee faults before they occur. Additionally, as more data is being collected by these devices every day, cloud computing becomes more necessary for sustainable urban living. The proposed model emphasizes solution recommendations for faults that occurred in real-life smart devices to mitigate faults at an early stage, which is a key requirement in today’s smart offices. The proposed model monitors the real-time health of IoT devices through an ML algorithm to make devices more efficient and increase the quality of life. Through the use of K-Nearest Neighbor, Decision Tree, Gaussian Naive Bayes and Random Forest approach, the proposed fault prediction recommender model has been evaluated and Random Forest shows the highest accuracy compared to other classifiers. Several performance indicators such as recall, accuracy, F1 score and precision were utilized to examine the performance of the model. The results have demonstrated the effectiveness of ML techniques applied to sensors in predicting faults in smart offices with Random Forest being observed as the best technique with a maximum accuracy of 94.27%. In future, deep learning can also be applied to bigger datasets to provide more accurate results.

Journal ArticleDOI
TL;DR: In this paper, two created datasets generated from SDN using Mininet and Ryu controller with different feature extraction tools were used for training a number of supervised binary classification machine learning algorithms such as kNN, AdaBoost, decision tree (DT), random forest, naive Bayes, multilayer perceptron, support vector machine, and XGBoost.
Abstract: Software‐defined networking (SDN) has been developed to separate the network control plane from the forwarding plane, which can decrease operational costs and the time it takes to deploy new services compared to traditional networks. Despite these advantages, this technology brings threats and vulnerabilities. Consequently, developing high‐performance real‐time intrusion detection systems (IDSs) to classify malicious activities is a vital part of SDN architecture. This article introduces two created datasets generated from SDN using Mininet and Ryu controller with different feature extraction tools. The datasets contain normal traffic and different types of attacks (Fin flood, UDP flood, ICMP flood, OS probe scan, port probe scan, TCP bandwidth flood, and TCP syn flood) and are used for training a number of supervised binary classification machine learning algorithms such as k‐nearest neighbor, AdaBoost, decision tree (DT), random forest, naive Bayes, multilayer perceptron, support vector machine, and XGBoost. The DT algorithm has achieved scores high enough to fit a real‐time application, with an F1 score on the attack class of 0.9995, an F1 score on the normal class of 0.9983, and a throughput score of 6,737,147.275 samples per second with a total of three features. In addition, data preprocessing is used to reduce the model complexity, thereby increasing the overall throughput to fit a real‐time system.

Journal ArticleDOI
TL;DR: In this article, the authors compared the performance of four ML algorithms (kNN, SVM, ANN and RF) applied to LULC monitoring within the Mayo Rey department, North Province, Cameroon.

Journal ArticleDOI
TL;DR: In this article, the authors evaluated the performance of classification models along with different feature selection approaches on the structural magnetic resonance imaging data and showed that proper selection of the features and the classification models can improve the diagnosis of Schizophrenia.
Abstract: Machine learning models have been successfully employed in the diagnosis of Schizophrenia disease. The impact of classification models and feature selection techniques on the diagnosis of Schizophrenia has not been evaluated. Here, we sought to assess the performance of classification models along with different feature selection approaches on the structural magnetic resonance imaging data. The data consist of 72 subjects with Schizophrenia and 74 healthy control subjects. We evaluated different classification algorithms based on support vector machine (SVM), random forest, kernel ridge regression and randomized neural networks. Moreover, we evaluated T-Test, Receiver Operator Characteristics (ROC), Wilcoxon, entropy, Bhattacharyya, Minimum Redundancy Maximum Relevance (MRMR) and Neighbourhood Component Analysis (NCA) as the feature selection techniques. Based on the evaluation, SVM based models with Gaussian kernel proved better compared to other classification models and Wilcoxon feature selection emerged as the best feature selection approach. Moreover, in terms of data modality, the performance on the integration of grey matter and white matter proved better compared to the performance on grey and white matter individually. Our evaluation showed that classification algorithms along with the feature selection approaches impact the diagnosis of Schizophrenia disease. This indicates that proper selection of the features and the classification models can improve the diagnosis of Schizophrenia.

Journal ArticleDOI
TL;DR: In this paper, three machine learning models, i.e., Random Forest (RF), Multivariate Adaptive Regression Splines (MARS), and Deep Learning Neural Network (DLNN), have been used to accurately demarcate forest fire susceptibility zones.

Journal ArticleDOI
TL;DR: In this paper, an improved machine learning approach for accurate and robust state of charge (SOC) estimation in electric vehicle (EV) batteries using a differential search optimized random forest regression (RFR) algorithm is presented.
Abstract: This paper presents an improved machine learning approach for accurate and robust state of charge (SOC) estimation in electric vehicle (EV) batteries using a differential search optimized random forest regression (RFR) algorithm. Precise SOC estimation ensures the safety and reliability of EVs. Nevertheless, SOC is influenced by numerous factors and cannot be measured directly. RFR is suitable for real-time SOC estimation due to its robustness to noise and overfitting issues and its capacity to work with huge datasets. However, proper selection of the RFR architecture and hyper-parameter combination remains a key issue to be explored. Hence, a differential search algorithm (DSA) is employed to search for the optimal values of trees and leaves in the RFR algorithm. DSA optimized RFR eliminates the use of a filter in the data pre-processing steps and does not require detailed understanding and knowledge of battery chemistry; it only needs sensors to monitor battery voltage and current. The developed approach is validated at room temperature using two types of lithium-ion batteries under a pulse discharge test. In addition, the proposed model is verified under varying temperature settings under EV drive cycles. The experimental results demonstrate that the DSA optimized RFR algorithm achieves an RMSE of 0.382% in the HPPC test using a LiNMC battery. Besides, the proposed method obtains satisfactory outcomes in EV drive cycles, with MAE of 0.193% and 0.346% in the DST and FUDS cycles, respectively, at 25°C.
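The two error measures reported in this abstract, RMSE and MAE, summarize how far the estimated SOC is from the reference SOC over a test run. The sketch below shows how both are computed; the function names and the SOC values (in percent) are hypothetical and are not measurements from the paper.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error: penalizes large deviations more heavily."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error: average magnitude of the deviations."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical reference and estimated SOC values in percent, for illustration only.
soc_true = [80.0, 79.5, 79.0, 78.4]
soc_pred = [80.2, 79.4, 79.3, 78.4]

print(f"RMSE = {rmse(soc_true, soc_pred):.3f}%")
print(f"MAE  = {mae(soc_true, soc_pred):.3f}%")
```

RMSE is never smaller than MAE on the same data, and the gap between them widens when a few samples have much larger errors than the rest, which is why the two are often reported together.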

Journal ArticleDOI
TL;DR: Wang et al. used machine learning (ML) algorithms with innovative methods of hyperparameter optimization, such as halving search, grid search, and random search, together with k-fold cross-validation, to derive the seismic fragility curve for accelerating seismic risk assessment.

Journal ArticleDOI
TL;DR: In this paper, hybrid Random Forest, support vector regression, and response surface methodology models are implemented to predict CO2 emission in 30 major cities in China; seven optimizers are applied to the random forest and two optimizers are applied to SVM to tune their hyper-parameters.