Showing papers on "AdaBoost" published in 2022

Journal Article•DOI•

An Ensemble Approach to Predict Early-Stage Diabetes Risk Using Machine Learning: An Empirical Study

Umm-e Laila, Khalid Mahboob, Abdul Wahid Khan, Faheem Khan, Whangbo Taekeun

01 Jul 2022-Sensors

TL;DR: The Random Forest Ensemble Method had the best accuracy (97%), whereas the AdaBoost and Bagging algorithms had lower accuracy, precision, recall, and F1-scores.

Abstract: Diabetes is a long-lasting disease triggered by expanded sugar levels in human blood and can affect various organs if left untreated. It contributes to heart disease, kidney issues, damaged nerves, damaged blood vessels, and blindness. Timely disease prediction can save precious lives and enable healthcare advisors to take care of the conditions. Most diabetic patients know little about the risk factors they face before diagnosis. Nowadays, hospitals deploy basic information systems, which generate vast amounts of data that cannot be converted into proper/useful information and cannot be used to support decision making for clinical purposes. There are different automated techniques available for the earlier prediction of disease. Ensemble learning is a data analysis technique that combines multiple techniques into a single optimal predictive system to evaluate bias and variation, and to improve predictions. Diabetes data, which included 17 variables, were gathered from the UCI repository of various datasets. The predictive models used in this study include AdaBoost, Bagging, and Random Forest, to compare the precision, recall, classification accuracy, and F1-score. Finally, the Random Forest Ensemble Method had the best accuracy (97%), whereas the AdaBoost and Bagging algorithms had lower accuracy, precision, recall, and F1-scores.

55 citations
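
The four scores compared in the abstract above (accuracy, precision, recall, F1-score) can all be derived from confusion-matrix counts. A minimal sketch, assuming a binary task; the function name and the counts are hypothetical and are not taken from the paper.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, recall, and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical counts for illustration only.
metrics = classification_metrics(tp=45, fp=5, fn=5, tn=45)
```

Reporting all four scores matters on imbalanced medical data, where accuracy alone can look high while recall on the positive class stays low.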

Journal Article•DOI•

A Feature Selection Based on the Farmland Fertility Algorithm for Improved Intrusion Detection Systems

Touraj Sattari Naseri, Farhad Soleimanian Gharehchopogh

19 Mar 2022-Journal of Network and Systems Management

45 citations

Journal Article•DOI•

Machine learning modeling and analysis of biohydrogen production from wastewater by dark fermentation process

Ahmad Hosseinzadeh¹, John L. Zhou¹, Ali Altaee¹, Donghao Li²•Institutions (2)

University of Technology, Sydney¹, Yanbian University²

01 Jan 2022-Bioresource Technology

TL;DR: In this paper, different ML procedures were assessed based on the mean squared error (MSE) and determination coefficient (R2) to select the most robust models for modeling the process.

41 citations

Journal Article•DOI•

Performance analysis of machine learning models for intrusion detection system using Gini Impurity-based Weighted Random Forest (GIWRF) feature selection technique

Raisa Abedin Disha, Sajjad Waheed

04 Jan 2022-Cybersecurity

TL;DR: In this paper, a Gini Impurity-based Weighted Random Forest (GIWRF) was used as the embedded feature selection technique for an intrusion detection system (IDS) in order to protect the network, resources and sensitive data.

Abstract: To protect the network, resources, and sensitive data, the intrusion detection system (IDS) has become a fundamental component of organizations that prevents cybercriminal activities. Several approaches have been introduced and implemented to thwart malicious activities so far. Due to the effectiveness of machine learning (ML) methods, the proposed approach applied several ML models for the intrusion detection system. In order to evaluate the performance of models, UNSW-NB 15 and Network TON_IoT datasets were used for offline analysis. Both datasets are comparatively newer than the NSL-KDD dataset to represent modern-day attacks. However, the performance analysis was carried out by training and testing the Decision Tree (DT), Gradient Boosting Tree (GBT), Multilayer Perceptron (MLP), AdaBoost, Long-Short Term Memory (LSTM), and Gated Recurrent Unit (GRU) for the binary classification task. As the performance of IDS deteriorates with a high dimensional feature vector, an optimum set of features was selected through a Gini Impurity-based Weighted Random Forest (GIWRF) model as the embedded feature selection technique. This technique employed Gini impurity as the splitting criterion of trees and adjusted the weights for two different classes of the imbalanced data to make the learning algorithm understand the class distribution. Based upon the importance score, 20 features were selected from UNSW-NB 15 and 10 features from the Network TON_IoT dataset. The experimental result revealed that DT performed better with the feature selection technique than the other trained models of this experiment. Moreover, the proposed GIWRF-DT outperformed other existing methods surveyed in the literature in terms of the F1 score.

41 citations
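
The abstract above describes Gini impurity as the splitting criterion, with class weights adjusted for imbalanced data. A rough sketch of how class weights change the impurity of a node is given below; it is not the GIWRF implementation, and the function name, labels, and weights are assumptions chosen for illustration.

```python
def weighted_gini(labels, class_weight):
    """Gini impurity of a node where each sample counts with its class weight."""
    totals = {}
    for y in labels:
        totals[y] = totals.get(y, 0.0) + class_weight[y]
    total = sum(totals.values())
    if total == 0:
        return 0.0
    # 1 - sum of squared (weighted) class proportions.
    return 1.0 - sum((w / total) ** 2 for w in totals.values())


# Hypothetical node: nine majority-class samples and one minority-class sample.
labels = [0] * 9 + [1]
plain = weighted_gini(labels, {0: 1.0, 1: 1.0})     # unweighted impurity
balanced = weighted_gini(labels, {0: 1.0, 1: 9.0})  # minority class up-weighted
```

With equal weights the node looks almost pure; up-weighting the minority class raises its impurity, so a tree has more incentive to split on features that isolate the rare class.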

Journal Article•DOI•

Compressive strength prediction of fly ash-based geopolymer concrete via advanced machine learning techniques

Caroline Lindblad¹•Institutions (1)

COMSATS Institute of Information Technology¹

01 Jun 2022-Case Studies in Construction Materials

TL;DR: In this paper, the compressive strength of fly ash-based geopolymer concrete is estimated using decision tree, bagging regressor, and AdaBoost regressor with an R2 value of 0.97.

39 citations

Journal Article•DOI•

Application of Soft Computing Techniques to Predict the Strength of Geopolymer Composites

Qichen Wang, Waqas Ahmad, Ayaz Ahmad, Fahid Aslam, Abdullahi Ali Mohamed, Nikolay Vatin

01 Mar 2022-Polymers

TL;DR: It was discovered that ensembled machine learning techniques outperformed individual machine learning techniques in forecasting the compressive strength of geopolymer composites; however, the outcomes of the individual machine learning model were also within the acceptable limit.

Abstract: Geopolymers may be the best alternative to ordinary Portland cement because they are manufactured using waste materials enriched in aluminosilicate. Research on geopolymer composites is accelerating. However, considerable work, expense, and time are needed to cast, cure, and test specimens. The application of computational methods to the stated objective is critical for speedy and cost-effective research. In this study, supervised machine learning approaches were employed to predict the compressive strength of geopolymer composites. One individual machine learning approach, decision tree, and two ensembled machine learning approaches, AdaBoost and random forest, were used. The coefficient correlation (R2), statistical tests, and k-fold analysis were used to determine the validity and comparison of all models. It was discovered that ensembled machine learning techniques outperformed individual machine learning techniques in forecasting the compressive strength of geopolymer composites. However, the outcomes of the individual machine learning model were also within the acceptable limit. R2 values of 0.90, 0.90, and 0.83 were obtained for AdaBoost, random forest, and decision models, respectively. The models’ decreased error values, such as mean absolute error, mean absolute percentage error, and root-mean-square errors, further confirmed the ensembled machine learning techniques’ increased precision. Machine learning approaches will aid the building industry by providing quick and cost-effective methods for evaluating material properties.

35 citations

Journal Article•DOI•

A novel approach to explain the black-box nature of machine learning in compressive strength predictions of concrete using Shapley additive explanations (SHAP)

Imesh Udara Ekanayake, D. Meddage, Upaka Rathnayake

01 Jun 2022-Case Studies in Construction Materials

TL;DR: In this article, a black-box interpretation approach was employed to elucidate the predictions of tree-based and LKRR algorithms for compressive strength prediction of concrete, and the comparison revealed that tree-based algorithms and LKRR provided acceptable accuracy.

35 citations

Journal Article•DOI•

Slope Stability Classification under Seismic Conditions Using Several Tree-Based Intelligent Techniques

Panagiotis G. Asteris, Fariz Iskandar Mohd Rizal, Mohammadreza Koopialipoor, Panayiotis C. Roussis, Maria Ferentinou, Danial Jahed Armaghani, Behrouz Gordan

08 Feb 2022-Applied Sciences

TL;DR: In this paper, the authors investigated the application of tree-based models, including decision tree (DT), random forest (RF), and AdaBoost, in slope stability classification under seismic loading conditions.

Abstract: Slope stability analysis allows engineers to pinpoint risky areas, study trigger mechanisms for slope failures, and design slopes with optimal safety and reliability. Before the widespread usage of computers, slope stability analysis was conducted through semi analytical methods, or stability charts. Presently, engineers have developed many computational tools to perform slope stability analysis more efficiently. The challenge associated with furthering slope stability methods is to create a reliable design solution to perform reliable estimations involving a number of geometric and mechanical variables. The objective of this study was to investigate the application of tree-based models, including decision tree (DT), random forest (RF), and AdaBoost, in slope stability classification under seismic loading conditions. The input variables used in the modelling were slope height, slope inclination, cohesion, friction angle, and peak ground acceleration to classify safe slopes and unsafe slopes. The training data for the developed computational intelligence models resulted from a series of slope stability analyses performed using a standard geotechnical engineering software commonly used in geotechnical engineering practice. Upon construction of the tree-based models, the model assessment was performed through the use and calculation of accuracy, F1-score, recall, and precision indices. All tree-based models could efficiently classify the slope stability status, with the AdaBoost model providing the highest performance for the classification of slope stability for both model development and model assessment parts. The proposed AdaBoost model can be used as a screening tool during the stage of feasibility studies of related infrastructure projects, to classify slopes according to their expected status of stability under seismic loading conditions.

34 citations

Journal Article•DOI•

Prediction of Pile Bearing Capacity Using XGBoost Algorithm: Modeling and Performance Evaluation

Maaz Amjad, Irshad Ahmad, Mahmood-ul-Hassan Ahmad, Piotr Wróblewski, Paweł Kamiński, Uzair Amjad

18 Feb 2022-Applied Sciences

TL;DR: In this article, a new model for predicting pile bearing capacity is developed using an extreme gradient boosting (XGBoost) algorithm; a total of 200 driven-pile static load test-based case histories were used to construct and verify the model.

Abstract: The major criteria that control pile foundation design is pile bearing capacity (Pu). The load bearing capacity of piles is affected by the various characteristics of soils and the involvement of multiple parameters related to both soil and foundation. In this study, a new model for predicting bearing capacity is developed using an extreme gradient boosting (XGBoost) algorithm. A total of 200 driven piles static load test-based case histories were used to construct and verify the model. The developed XGBoost model results were compared to a number of commonly used algorithms—Adaptive Boosting (AdaBoost), Random Forest (RF), Decision Tree (DT) and Support Vector Machine (SVM) using various performance measure metrics such as coefficient of determination, mean absolute error, root mean square error, mean absolute relative error, Nash–Sutcliffe model efficiency coefficient and relative strength ratio. Furthermore, sensitivity analysis was performed to determine the effect of input parameters on Pu. The results show that all of the developed models were capable of making accurate predictions however the XGBoost algorithm surpasses others, followed by AdaBoost, RF, DT, and SVM. The sensitivity analysis result shows that the SPT blow count along the pile shaft has the greatest effect on the Pu.

Journal Article•DOI•

High-performance concrete strength prediction based on ensemble learning

Qing Fu Li, Zongming Song

01 Mar 2022-Construction and Building Materials

TL;DR: In this article, compressive strength and tensile strength tests were conducted on high performance concrete (HPC) with fly ash and silica fume separately and together, and with polypropylene fiber in triple-blending.

Journal Article•DOI•

Medium-term load forecasting in isolated power systems based on ensemble machine learning models

Pavel Matrenin, Murodbek Safaraliev, Stepan N. Dmitriev, Sergey Kokin, Anvari Ghulomzoda, S. M. Mitrofanov

01 Apr 2022-Energy Reports

TL;DR: In this article, a model for medium-term forecasting of load graphs for an electric power system (EPS) with specific properties, based on ensemble machine learning methods, is proposed.

Journal Article•DOI•

Prostate cancer classification from ultrasound and MRI images using deep learning based Explainable Artificial Intelligence

Md. Rafiul Hassan¹, Ted Fleming², Md. Fakrul Islam, Md. Zia Uddin³, Goutam Ghoshal, Mohammad Mehedi Hassan², Shamsul Huda⁴, Giancarlo Fortino⁵•Institutions (5)

University of Maine at Presque Isle¹, King Saud University², SINTEF³, Deakin University⁴, University of Calabria⁵

01 Feb 2022-Future Generation Computer Systems

TL;DR: A novel automated classification algorithm by fusing a number of deep learning approaches has been proposed to detect prostate cancer from ultrasound (US) and MRI images and explains why a specific decision is made given the input US or MRI image.

Journal Article•DOI•

Real-time prediction of rock mass classification based on TBM operation big data and stacking technique of ensemble learning

01 Feb 2022-Journal of rock mechanics and geotechnical engineering

TL;DR: The authors proposed a stacking ensemble classifier for the real-time prediction of rock mass classification using tunnel boring machine (TBM) operation data, which showed a more powerful learning and generalisation ability for small and imbalanced samples.

Abstract: Real-time prediction of the rock mass class in front of the tunnel face is essential for the adaptive adjustment of tunnel boring machines (TBMs). During the TBM tunnelling process, a large number of operation data are generated, reflecting the interaction between the TBM system and surrounding rock, and these data can be used to evaluate the rock mass quality. This study proposed a stacking ensemble classifier for the real-time prediction of the rock mass classification using TBM operation data. Based on the Songhua River water conveyance project, a total of 7538 TBM tunnelling cycles and the corresponding rock mass classes are obtained after data preprocessing. Then, through the tree-based feature selection method, 10 key TBM operation parameters are selected, and the mean values of the 10 selected features in the stable phase after removing outliers are calculated as the inputs of classifiers. The preprocessed data are randomly divided into the training set (90%) and test set (10%) using simple random sampling. Besides stacking ensemble classifier, seven individual classifiers are established as the comparison. These classifiers include support vector machine (SVM), k-nearest neighbors (KNN), random forest (RF), gradient boosting decision tree (GBDT), decision tree (DT), logistic regression (LR) and multi-layer perceptron (MLP), where the hyper-parameters of each classifier are optimised using the grid search method. The prediction results show that the stacking ensemble classifier has a better performance than individual classifiers, and it shows a more powerful learning and generalisation ability for small and imbalanced samples. Additionally, a relative balance training set is obtained by the synthetic minority oversampling technique (SMOTE), and the influence of sample imbalance on the prediction performance is discussed.

Journal Article•DOI•

Application of a modern multi-level ensemble approach for the estimation of critical shear stress in cohesive sediment mixture

R. K. Singh, Mehdi Jamei, Masoud Karbasi, Anurag Malik, Manish Pandey

01 Feb 2022-Journal of hydrology

TL;DR: In this article, a multi-level ensemble machine learning (ML) approach was used to determine the critical shear stress (CSS) of gravel particles in a cohesive mixture of clay-silt-gravel.

Abstract: Exploration of incipient motion study is significantly important for the river hydraulics community. The present study, along with experimental investigation, considered a new multi-level ensemble machine learning (ML) to determine critical shear stress (CSS) of gravel particles in a cohesive mixture of clay-silt-gravel, clay-silt-sand-gravel, and clay-sand-gravel. The multi-level ensemble ML included a voting-based ensemble meta-estimator integrated with three modern standalone ensemble techniques, namely extreme gradient boosting (XGBoost), Adaptive boosting (Adaboost), and Random Forest (RF), and performance is compared with three standalone ensemble models for prediction of CSS values. Besides, the optimum input combinations were explored using the forward stepwise selection method, as a correlation-based feature selection, and mutual information theory. The outcomes of simulation indicated that the multi-level ensemble machine learning (voting) model in terms of correlation coefficient (R = 0.9641), and root mean square error (RMSE = 0.2022) was superior to the standalone ensemble techniques, i.e., XGBoost (R = 0.9482, and RMSE = 0.2375), Adaboost (R = 0.9496, and RMSE = 0.2387), and RF (R = 0.9392, and RMSE = 0.2739) for accurate estimation of CSS.
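
The abstract above compares a voting meta-estimator against standalone regressors using the correlation coefficient (R) and root mean square error (RMSE). A minimal sketch of both metrics and of an unweighted voting average is shown here; the paper's exact voting weights are not stated, so equal weights are an assumption, and all names and numbers are hypothetical.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between observed and predicted values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient between observed and predicted values."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


def voting_average(*model_predictions):
    """Average the predictions of several regressors, sample by sample."""
    return [sum(vals) / len(vals) for vals in zip(*model_predictions)]


# Hypothetical observations and predictions from two base models.
y_true = [1.0, 2.0, 3.0, 4.0]
model_a = [1.2, 1.8, 3.3, 3.9]
model_b = [0.8, 2.2, 2.7, 4.1]
ensemble = voting_average(model_a, model_b)
```

In this toy case the two models err in opposite directions, so averaging cancels the errors; real base models are usually correlated, and the gain is smaller.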

Journal Article•DOI•

Estimation of tetracycline antibiotic photodegradation from wastewater by heterogeneous metal-organic frameworks photocatalysts

Jafar Abdi¹, Abdolhossein Hemmati-Sarapardeh²,⁴, Masoud Hadipoor³, Fahimeh Hadavimoghaddam•Institutions (4)

University of Shahrood¹, Shahid Bahonar University of Kerman², Petroleum University of Technology³, Jilin University⁴

01 Jan 2022-Chemosphere

TL;DR: In this paper, the potential ability of various modern and powerful machine learning methods such as Categorical Boosting (CatBoost), Light Gradient Boosting Machine (LightGBM), XGBoost, AdaBoost, GBDT, ET, DT, and Random Forest (RF) were investigated to estimate tetracycline (TC) photodegradation from wastewater by 10 different metal-organic frameworks (MOFs).

Journal Article•DOI•

A Neural Network Ensemble With Feature Engineering for Improved Credit Card Fraud Detection

01 Jan 2022-IEEE Access

TL;DR: The authors proposed an efficient approach to detect credit card fraud using a hybrid data resampling method and a neural network ensemble classifier, which is obtained using a long short-term memory (LSTM) neural network as the base learner in the adaptive boosting technique.

Abstract: Recent advancements in electronic commerce and communication systems have significantly increased the use of credit cards for both online and regular transactions. However, there has been a steady rise in fraudulent credit card transactions, costing financial companies huge losses every year. The development of effective fraud detection algorithms is vital in minimizing these losses, but it is challenging because most credit card datasets are highly imbalanced. Also, using conventional machine learning algorithms for credit card fraud detection is inefficient due to their design, which involves a static mapping of the input vector to output vectors. Therefore, they cannot adapt to the dynamic shopping behavior of credit card clients. This paper proposes an efficient approach to detect credit card fraud using a neural network ensemble classifier and a hybrid data resampling method. The ensemble classifier is obtained using a long short-term memory (LSTM) neural network as the base learner in the adaptive boosting (AdaBoost) technique. Meanwhile, the hybrid resampling is achieved using the synthetic minority oversampling technique and edited nearest neighbor (SMOTE-ENN) method. The effectiveness of the proposed method is demonstrated using publicly available real-world credit card transaction datasets. The performance of the proposed approach is benchmarked against the following algorithms: support vector machine (SVM), multilayer perceptron (MLP), decision tree, traditional AdaBoost, and LSTM. The experimental results show that the classifiers performed better when trained with the resampled data, and the proposed LSTM ensemble outperformed the other algorithms by obtaining a sensitivity and specificity of 0.996 and 0.998, respectively.

Journal Article•DOI•

AdaBoost Ensemble Methods Using K-Fold Cross Validation for Survivability with the Early Detection of Heart Disease

T. R. Mahesh, V. Dhilip Kumar, V. Vinoth Kumar, Junaid Asghar, Oana Geman, G. Arulkumaran, N. Arun

18 Apr 2022-Computational Intelligence and Neuroscience

TL;DR: The experimental results demonstrate that the AdaBoost-Random Forest classifier provides 95.47% accuracy in the early detection of heart disease.

Abstract: As a result of technology improvements, various features have been collected for heart disease diagnosis. Large data sets have several drawbacks, including limited storage capacity and long access and processing times. For medical therapy, early diagnosis of heart problems is crucial. Disease of heart is a devastating human disease that is quickly increasing in developed and also developing countries, resulting in death. In this type of disease, the heart normally fails to provide enough blood to different body parts in order to allow them to perform their regular functions. Early, as well as, proper diagnosis of this condition is very critical for averting further damage and also to save patients' lives. In this work, machine learning (ML) is utilized to find out whether a person has cardiac disease or not. Both the types of ensemble classifiers, namely, homogeneous as well as heterogeneous classifiers (formed by combining two separate classifiers), have been implemented in this work. The data mining preprocessing using Synthetic Minority Oversampling Technique (SMOTE) has been employed to cope with the imbalance problem of the class as well as noise. The proposed work has two steps. SMOTE is used in the initial phase to reduce the impact of data imbalance and the second phase is classifying data using Naive Bayes (NB), decision tree (DT) algorithms, and their ensembles. The experimental results demonstrate that the AdaBoost-Random Forest classifier provides 95.47% accuracy in the early detection of heart disease.
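
The abstract above uses SMOTE to reduce class imbalance before classification. The core idea, interpolating between a minority sample and one of its nearest minority neighbours, can be sketched as follows. This is a simplified illustration and not the authors' pipeline: the base sample is picked at random, and the function name, `k`, and the data points are assumptions.

```python
import random


def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between a randomly
    chosen minority sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance to x.
        others = [p for p in minority if p is not x]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment from x to neighbour
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(x, neighbour)))
    return synthetic


# Hypothetical 2-D minority-class samples.
minority = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5), (3.0, 3.0)]
new_points = smote(minority, n_new=5)
```

Because each synthetic point lies on a segment between two real minority samples, oversampling should be applied to the training split only, so that test data stay free of synthetic points.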

Journal Article•DOI•

A Machine Learning Method with Filter-Based Feature Selection for Improved Prediction of Chronic Kidney Disease

Sarah A. Ebiaredoh-Mienye, Theo G. Swart, Ebenezer Esenogho, Ibomoiye Domor Mienye

28 Jul 2022-Bioengineering

TL;DR: The proposed approach to effectively detect CKD by combining the information-gain-based feature selection technique and a cost-sensitive adaptive boosting (AdaBoost) classifier has produced an effective predictive model for CKD diagnosis and could be applied to more imbalanced medical datasets for effective disease detection.

Abstract: The high prevalence of chronic kidney disease (CKD) is a significant public health concern globally. The condition has a high mortality rate, especially in developing countries. CKD often goes undetected since there are no obvious early-stage symptoms. Meanwhile, early detection and on-time clinical intervention are necessary to reduce the disease progression. Machine learning (ML) models can provide an efficient and cost-effective computer-aided diagnosis to assist clinicians in achieving early CKD detection. This research proposed an approach to effectively detect CKD by combining the information-gain-based feature selection technique and a cost-sensitive adaptive boosting (AdaBoost) classifier. An approach like this could save CKD screening time and cost since only a few clinical test attributes would be needed for the diagnosis. The proposed approach was benchmarked against recently proposed CKD prediction methods and well-known classifiers. Among these classifiers, the proposed cost-sensitive AdaBoost trained with the reduced feature set achieved the best classification performance with an accuracy, sensitivity, and specificity of 99.8%, 100%, and 99.8%, respectively. Additionally, the experimental results show that the feature selection positively impacted the performance of the various classifiers. The proposed approach has produced an effective predictive model for CKD diagnosis and could be applied to more imbalanced medical datasets for effective disease detection.
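
For readers unfamiliar with the boosting step that the abstract above builds on, here is a minimal sketch of discrete AdaBoost with threshold stumps on labels in {-1, +1}. It is not the paper's cost-sensitive algorithm: the optional `class_cost` argument, which only scales the initial sample weights per class, is one simple assumed way to add cost sensitivity, and all names and data are hypothetical.

```python
import math


def train_stump(X, y, w):
    """Return (error, feature, threshold, polarity) of the threshold stump
    with the lowest weighted error."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            for polarity in (1, -1):
                pred = [polarity if row[j] <= thr else -polarity for row in X]
                err = sum(wi for wi, p, t in zip(w, pred, y) if p != t)
                if best is None or err < best[0]:
                    best = (err, j, thr, polarity)
    return best


def stump_predict(stump, row):
    _, j, thr, polarity = stump
    return polarity if row[j] <= thr else -polarity


def adaboost(X, y, rounds=5, class_cost=None):
    """Discrete AdaBoost; class_cost (assumed extension) scales initial weights."""
    cost = class_cost or {-1: 1.0, 1: 1.0}
    w = [cost[t] for t in y]
    total = sum(w)
    w = [wi / total for wi in w]
    model = []
    for _ in range(rounds):
        stump = train_stump(X, y, w)
        err = max(stump[0], 1e-10)  # guard against log of infinity
        if err >= 0.5:              # no better than chance: stop
            break
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, stump))
        # Increase weights of misclassified samples, decrease the rest.
        w = [wi * math.exp(-alpha * t * stump_predict(stump, row))
             for wi, row, t in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model


def predict(model, row):
    score = sum(alpha * stump_predict(stump, row) for alpha, stump in model)
    return 1 if score >= 0 else -1


# Hypothetical 1-D data that no single stump can separate.
X = [[1.0], [2.0], [3.0], [4.0], [5.0], [6.0]]
y = [1, 1, -1, -1, 1, 1]
model = adaboost(X, y, rounds=3)
train_preds = [predict(model, row) for row in X]
```

On this toy set, three weighted stumps are enough to fit the training labels, although each stump alone misclassifies at least two points.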

Proceedings Article•DOI•

Lung cancer prediction model using ensemble learning techniques and a systematic review analysis

Muntasir Mamun, Afia Farjana, Miraz Al Mamun, Md. Salim Ahammed

06 Jun 2022

TL;DR: Ensemble learning models were developed on a survey dataset of 309 people with or without lung cancer, oversampled with the SMOTE method; the ensemble techniques used are XGBoost, LightGBM, Bagging, and AdaBoost, evaluated with 10-fold cross-validation.

Abstract: Lung cancers are malignant lung tumors resulting from uncontrolled growth of lung cells that metastasizes to other parts of the body and can cause death. Although lung cancer cannot be prevented, the risk of cancer development can be lowered. Early detection of lung cancer is essential for patient survival, and machine learning-based prediction models have potential use in predicting lung cancer. Ensemble techniques are compelling and powerful techniques in Machine Learning to improve the prediction accuracy as classifiers. This paper reviewed some research articles on lung cancer prediction models that used machine learning and ensemble learning techniques. Furthermore, we added our newly developed ensemble learning techniques to this paper which was developed based on a survey dataset of 309 people with or without lung cancer by oversampling SMOTE method. The ensemble techniques we used are XGBoost, LightGBM, Bagging, and AdaBoost by k-fold 10 cross-validation method and the attributes our lung cancer prediction models used are age, smoking, yellow fingers, anxiety, peer pressure, chronic disease, fatigue, allergy, wheezing, alcohol, coughing, shortness of breath, swallowing difficulty, and chest pain. Results: According to our analysis, the XGBoost technique performed better than other ensemble techniques and achieved an accuracy of 94.42 %, precision of 95.66%, recall of 94.46%, and AUC of 98.14%, respectively.

Journal Article•DOI•

Novel ensemble intelligence methodologies for rockburst assessment in complex and variable environments

Diyuan Li, Zida Liu, Danial Jahed Armaghani, Peng Xiao, Jian Zhou

03 Feb 2022-Scientific Reports

TL;DR: The authors investigated ensemble trees, i.e., random forest (RF), extremely randomized tree (ET), adaptive boosting machine (AdaBoost), gradient boosting machine, extreme gradient boosting machine (XGBoost), light gradient boosting machine, and category gradient boosting machine, for predicting strong rockburst.

Abstract: Rockburst is a severe geological hazard that restricts deep mine operations and tunnel constructions. To overcome the shortcomings of widely used algorithms in rockburst prediction, this study investigates the ensemble trees, i.e., random forest (RF), extremely randomized tree (ET), adaptive boosting machine (AdaBoost), gradient boosting machine, extreme gradient boosting machine (XGBoost), light gradient boosting machine, and category gradient boosting machine, for rockburst estimation based on 314 real rockburst cases. Additionally, Bayesian optimization is utilized to optimize these ensemble trees. To improve performance, three combination strategies, voting, bagging, and stacking, are adopted to combine multiple models according to training accuracy. ET and XGBoost receive the best capabilities (85.71% testing accuracy) in single models, and except for AdaBoost, six ensemble trees have high accuracy and can effectively foretell strong rockburst to prevent large-scale underground disasters. The combination models generated by voting, bagging, and stacking perform better than single models, and the voting 2 model that combines XGBoost, ET, and RF with simple soft voting, is the most outstanding (88.89% testing accuracy). The performed sensitivity analysis confirms that the voting 2 model has better robustness than single models and has remarkable adaptation and superiority when input parameters vary or miss, and it has more power to deal with complex and variable engineering environments. Eventually, the rockburst cases in Sanshandao Gold Mine, China, were investigated, and these data verify the practicability of voting 2 in field rockburst prediction.
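
The best-performing model in the abstract above combines XGBoost, ET, and RF with simple soft voting, i.e. averaging the class probabilities of the base models and picking the most probable class. A minimal sketch, with hypothetical probabilities standing in for the outputs of three trained models:

```python
def soft_vote(prob_lists):
    """Average class-probability vectors from several models and return
    (predicted class index, averaged probabilities)."""
    n_models = len(prob_lists)
    n_classes = len(prob_lists[0])
    avg = [sum(p[c] for p in prob_lists) / n_models for c in range(n_classes)]
    label = max(range(n_classes), key=avg.__getitem__)
    return label, avg


# Hypothetical class probabilities from three models for one sample.
probs = [
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.3, 0.4, 0.3],
]
label, avg = soft_vote(probs)
```

Unlike hard (majority) voting, soft voting lets a confident model outweigh two uncertain ones, which is why it needs base models that output reasonably calibrated probabilities.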

Journal Article•DOI•

Estimating the thermal conductivity of soils using six machine learning algorithms

Kaiqi Li, Yong Liu, Qing Kang

01 Jul 2022-International Communications in Heat and Mass Transfer

TL;DR: In this article, a large database containing 2197 data points from various literature was compiled, and six machine learning algorithms, namely multivariance linear regression (MLR), Gaussian process regression (GPR), support vector machine (SVM), decision tree (DT), random forest (RF), and adaptive boosting methods (AdaBoost), were implemented to predict the thermal conductivity of soils based on the compiled database.

Journal Article•DOI•

How can we manage Offensive Text in Social Media - A Text Classification Approach using LSTM-BOOST

Md. Anwar Hussen Wadud, Muhammad Mohsin Kabir, M. F. Mridha, M. Ameer Ali, Md. Abdul Hamid, Muhammad Mostafa Monowar

01 Nov 2022-International journal of information management data insights

TL;DR: In this article, the authors proposed an offensive text classification algorithm named LSTM-BOOST, which employs a Long Short-Term Memory (LSTM) model with ensemble learning to recognize offensive Bengali texts on various social media platforms.

Abstract: Recently, offensive content has become increasingly popular for harassing and criticizing people on numerous social media platforms. This paper proposes an offensive text classification algorithm named LSTM-BOOST, which employs a Long Short-Term Memory (LSTM) model with ensemble learning to recognize offensive Bengali texts on various social media platforms. The proposed LSTM-BOOST model uses a modified AdaBoost algorithm employing principal component analysis (PCA) along with LSTM networks. In the LSTM-BOOST model, the dataset is divided into three categories, and PCA and LSTM networks are applied to each part of the dataset to obtain the most significant variance and reduce the weighted error of the weak hypothesis of the model. Furthermore, different classifiers are used for baseline experiments, and the model is evaluated on various word embedding vector methods. Our investigation found that the LSTM-BOOST algorithm outperforms most of the baseline architectures, with a leading F1-score of 92.61% on the Bengali offensive text from Social Platforms (BHSSP) dataset.

Journal Article•DOI•

Machine learning approaches in Covid-19 severity risk prediction in Morocco

Mariam Laatifi, Samira Douzi, Abdelaziz Bouklouz, Hind Ezzine, Jaafar Jaafari, Younes Zaid, Bouabid El Ouahidi, Mariam Naciri

06 Jan 2022-Journal of Big Data

TL;DR: In this paper, the authors developed and tested machine learning-based models for COVID-19 severity prediction, reporting 100% accuracy, specificity, sensitivity, and ROC curve for prognostic prediction using different machine learning classifiers.

Abstract: The purpose of this study is to develop and test machine learning-based models for COVID-19 severity prediction. COVID-19 test samples from 337 COVID-19 positive patients at Cheikh Zaid Hospital were grouped according to the severity of their illness. Ours is the first study to estimate illness severity by combining biological and non-biological data from patients with COVID-19. Moreover, the use of ML for therapeutic purposes in Morocco is currently restricted, and ours is the first study to investigate the severity of COVID-19. When data analysis approaches were used to uncover patterns and essential characteristics in the data, C-reactive protein, platelets, and D-dimers were determined to be the most associated with COVID-19 severity prediction. In this research, many data reduction algorithms were used, and machine learning models were trained to predict the severity of sickness using patient data. A new feature engineering method based on topological data analysis, called Uniform Manifold Approximation and Projection (UMAP), was shown to achieve better results. It has 100% accuracy, specificity, sensitivity, and ROC curve in conducting a prognostic prediction using different machine learning classifiers such as XGBoost, AdaBoost, Random Forest, and ExtraTrees. The proposed approach aims to assist hospitals and medical facilities in determining who should be seen first and who has a higher priority for admission to the hospital.

Journal Article•DOI•

Comparative Study of Machine Learning Classifiers for Modelling Road Traffic Accidents

Tebogo Bokaba, Wesley Doorsamy, Babu Sena Paul

14 Jan 2022-Applied Sciences

TL;DR: Analysis of widely used machine learning classifiers using a real-life RTA dataset from Gauteng, South Africa shows that the RF classifier, combined with multiple imputations by chained equations, yielded the best performance when compared with the other combinations.

Abstract: Road traffic accidents (RTAs) are a major cause of injuries and fatalities worldwide. In recent years, there has been a growing global interest in analysing RTAs, specifically concerned with analysing and modelling accident data to better understand and assess the causes and effects of accidents. This study analysed the performance of widely used machine learning classifiers using a real-life RTA dataset from Gauteng, South Africa. The study aimed to assess prediction model designs for RTAs to assist transport authorities and policymakers. It considered classifiers such as naïve Bayes, logistic regression, k-nearest neighbour, AdaBoost, support vector machine, random forest, and five missing data methods. These classifiers were evaluated using five evaluation metrics: accuracy, root-mean-square error, precision, recall, and receiver operating characteristic curves. Furthermore, the assessment involved parameter adjustment and incorporated dimensionality reduction techniques. The empirical results and analyses show that the RF classifier, combined with multiple imputations by chained equations, yielded the best performance when compared with the other combinations.

Journal Article•DOI•

An empowered AdaBoost algorithm implementation: A COVID-19 dataset study

Ender Sevinc

05 Jan 2022-Computers & Industrial Engineering

TL;DR: In this article, the authors proposed an improved learning model to predict patient severity by combining machine learning techniques: an adaptive boosting algorithm with a decision tree estimator and a new parameter tuning process.

Journal Article•DOI•

Comparative performance of eight ensemble learning approaches for the development of models of slope stability prediction

Sha Lin, Bei Han, Yanyan Li, Chao Han, Wei Li

01 Apr 2022-Acta Geotechnica

TL;DR: The analysis of engineering examples shows that the ensemble learning algorithm can deal with geotechnical material variables well and give accurate and reliable prediction results, which has good applicability for slope stability evaluation.
