
Showing papers on "Support vector machine" published in 2022


Journal ArticleDOI
TL;DR: In this paper, the authors used a standard fake hotel review dataset for experiments, together with data preprocessing methods and a term frequency-inverse document frequency (TF-IDF) approach for feature extraction and representation.
Abstract: Fake reviews, also known as deceptive opinions, are used to mislead people and have gained more importance recently. This is due to the rapid increase in online marketing transactions, such as selling and purchasing. E-commerce provides a facility for customers to post reviews and comment about the product or service when purchased. New customers usually go through the posted reviews or comments on the website before making a purchase decision. However, the current challenge is how new individuals can distinguish truthful reviews from fake ones, which deceive customers, inflict losses, and tarnish the reputation of companies. The present paper attempts to develop an intelligent system that can detect fake reviews on e-commerce platforms using n-grams of the review text and sentiment scores given by the reviewer. The methodology adopted in this study used a standard fake hotel review dataset for the experiments, data preprocessing methods, and a term frequency-inverse document frequency (TF-IDF) approach for extracting features and their representation. For detection and classification, n-grams of review texts were input into the constructed models to be classified as fake or truthful. The experiments were carried out using four different supervised machine-learning techniques, which were trained and tested on a dataset collected from the Trip Advisor website. The classification results of these experiments showed that naïve Bayes (NB), support vector machine (SVM), adaptive boosting (AB), and random forest (RF) achieved 88%, 93%, 94%, and 95%, respectively, based on testing accuracy and the F1-score. The obtained results were compared with existing works that used the same dataset, and the proposed methods outperformed the comparable methods in terms of accuracy.
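The n-gram and TF-IDF feature step described in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the tokenizer (whitespace split), the n-gram range (1 to 2), and the unsmoothed `ln(N / df)` weighting are all assumptions made for illustration, and the function names `ngrams` and `tfidf` are hypothetical.

```python
import math
from collections import Counter


def ngrams(text, n_max=2):
    """Return word n-grams of length 1..n_max from lowercased, whitespace-split text."""
    tokens = text.lower().split()
    grams = []
    for n in range(1, n_max + 1):
        grams += [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    return grams


def tfidf(docs, n_max=2):
    """Return one {n-gram: weight} dict per document.

    Assumed weighting: tf = count / total n-grams in the document,
    idf = ln(N / document frequency). Real libraries often smooth this.
    """
    grams_per_doc = [ngrams(d, n_max) for d in docs]
    doc_freq = Counter()
    for grams in grams_per_doc:
        doc_freq.update(set(grams))  # count each n-gram once per document
    n_docs = len(docs)
    weights = []
    for grams in grams_per_doc:
        counts = Counter(grams)
        total = len(grams)
        weights.append({g: (c / total) * math.log(n_docs / doc_freq[g])
                        for g, c in counts.items()})
    return weights


# Made-up review snippets, for illustration only.
docs = ["great hotel great stay", "terrible hotel"]
weights = tfidf(docs)
```

An n-gram that appears in every document (here `hotel`) gets weight zero under this weighting, which is why such vectors emphasize terms that discriminate between reviews. The resulting vectors would then be passed to a classifier such as NB, SVM, AB, or RF.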

104 citations



Journal ArticleDOI
TL;DR: Zhang et al. as discussed by the authors proposed a SOH estimation method based on improved ant lion optimization algorithm and support vector regression (IALO-SVR), which can achieve accurate estimation of SOH with high estimation accuracy and robustness, and the estimation error is stable within 2%.
Abstract: The state of health (SOH) estimation plays an important role in keeping the safe and stable operation of lithium-ion battery management system (BMS). To solve the problem of low estimation accuracy of traditional estimation methods, this paper proposes a SOH estimation method based on improved ant lion optimization algorithm and support vector regression (IALO-SVR). Firstly, the data of battery charge and discharge are analyzed geometrically, and four health features highly correlated with SOH decline are selected as the input of SVR model. Pearson correlation coefficient is used to quantitatively analyze the correlation between features and SOH. On the other hand, the IALO algorithm is used to optimize the kernel parameters of SVR, and the SOH estimation model is obtained after training with battery training set. To verify this method, batteries in different working conditions are verified on NASA battery data set, and compared with ALO-SVR and SVR. The experimental results show that this method can achieve accurate estimation of SOH, with high estimation accuracy and robustness, and the estimation error is stable within 2%.
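The Pearson correlation coefficient used here to screen health features can be sketched as follows. The feature and SOH values are made up for illustration and are not taken from the NASA battery data set; the function name `pearson` is hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


# Hypothetical health feature (e.g. a charge-curve measurement) and SOH values.
feature = [10.0, 9.5, 9.1, 8.4, 8.0]
soh = [1.00, 0.96, 0.93, 0.87, 0.84]
r = pearson(feature, soh)
```

A coefficient with magnitude close to 1 indicates a strong linear relationship, which is the criterion the abstract describes for choosing SVR inputs.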

102 citations


Journal ArticleDOI
TL;DR: In this paper, the development of a prediction model by processing the variational parameters with machine learning and studying properties such as characterization, stability, and density of rGO-Fe3O4-TiO2 hybrid nanofluids has provided an unprecedented study in the literature.

84 citations


Journal ArticleDOI
TL;DR: Wang et al. as discussed by the authors developed an ensemble learning-based method to predict the slope stability by introducing the random forest (RF) and extreme gradient boosting (XGBoost), which is applied to the stability prediction of 786 landslide cases in Yunyang County, Chongqing, China.
Abstract: Slope stability prediction plays a significant role in landslide disaster prevention and mitigation. This study develops an ensemble learning-based method to predict the slope stability by introducing the random forest (RF) and extreme gradient boosting (XGBoost). As an illustration, the proposed approach is applied to the stability prediction of 786 landslide cases in Yunyang County, Chongqing, China. For comparison, the predictive performance of RF, XGBoost, support vector machine (SVM), and logistic regression (LR) is systematically investigated based on the well-established confusion matrix, which contains the known indices of recall rate, precision, and accuracy. Furthermore, the feature importance of the 12 influencing variables is also explored. Results show that the accuracy of the XGBoost and RF for both the training and testing data is superior to that of SVM and LR, revealing the superiority of the ensemble learning models (i.e. XGBoost and RF) in the slope stability prediction of Yunyang County. Among the 12 influencing factors, the profile shape is the most important one. The proposed ensemble learning-based method offers a promising way to rationally capture the slope status. It can be extended to the prediction of slope stability of other landslide-prone areas of interest.
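The confusion-matrix indices named in this abstract (recall, precision, accuracy) can be computed as in the sketch below. The labels are invented for illustration, with 1 standing for an unstable slope; the function name is hypothetical.

```python
def confusion_metrics(y_true, y_pred, positive=1):
    """Return (accuracy, precision, recall) for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p != positive)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    # Guard against division by zero when no positives are predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Made-up labels: 1 = unstable slope, 0 = stable slope.
y_true = [1, 1, 0, 0, 1]
y_pred = [1, 0, 0, 1, 1]
accuracy, precision, recall = confusion_metrics(y_true, y_pred)
```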

77 citations


Journal ArticleDOI
TL;DR: A model using a fused machine learning approach for diabetes prediction based on the patient’s real-time medical record has a prediction accuracy of 94.87, which is higher than the previously published methods.
Abstract: In the medical field, it is essential to predict diseases early to prevent them. Diabetes is one of the most dangerous diseases all over the world. In modern lifestyles, sugar and fat are typically present in our dietary habits, which have increased the risk of diabetes. To predict the disease, it is extremely important to understand its symptoms. Currently, machine-learning (ML) algorithms are valuable for disease detection. This article presents a model using a fused machine learning approach for diabetes prediction. The conceptual framework consists of two types of models: Support Vector Machine (SVM) and Artificial Neural Network (ANN) models. These models analyze the dataset to determine whether a diabetes diagnosis is positive or negative. The dataset used in this research is divided into training data and testing data with a ratio of 70:30 respectively. The output of these models becomes the input membership function for the fuzzy model, whereas the fuzzy logic finally determines whether a diabetes diagnosis is positive or negative. A cloud storage system stores the fused models for future use. Based on the patient’s real-time medical record, the fused model predicts whether the patient is diabetic or not. The proposed fused ML model has a prediction accuracy of 94.87, which is higher than the previously published methods.

76 citations


Journal ArticleDOI
TL;DR: In this article, a new method for detecting COVID-19 and pneumonia using chest X-ray images was proposed, which can be described as a three-step process and achieved the highest testing classification accuracy of 96.6% using the VGG-19 model associated with the binary robust invariant scalable key-points (BRISK) algorithm.

76 citations


Journal ArticleDOI
TL;DR: It is proposed in this study that a unique intelligent diabetes mellitus prediction framework (IDMPF) is developed using machine learning after conducting a rigorous review of existing prediction models in the literature and examining their applicability to diabetes.
Abstract: Diabetes is a chronic disease that continues to be a significant and global concern since it affects the entire population's health. It is a metabolic disorder that leads to high blood sugar levels and many other problems such as stroke, kidney failure, and heart and nerve problems. Several researchers have attempted to construct an accurate diabetes prediction model over the years. However, this subject still faces significant open research issues due to a lack of appropriate data sets and prediction approaches, which pushes researchers to use big data analytics and machine learning (ML)-based methods. Applying four different machine learning methods, the research tries to overcome the problems and investigate healthcare predictive analytics. The study's primary goal was to see how big data analytics and machine learning-based techniques may be used in diabetes. The examination of the results shows that the suggested ML-based framework may achieve a score of 86. Health experts and other stakeholders are working to develop categorization models that will aid in the prediction of diabetes and the formulation of preventative initiatives. The authors perform a review of the literature on machine models and suggest an intelligent framework for diabetes prediction based on their findings. Machine learning models are critically examined, and an intelligent machine learning-based architecture for diabetes prediction is proposed and evaluated by the authors. In this study, the authors utilize the framework to develop and assess decision tree (DT)-based random forest (RF) and support vector machine (SVM) learning models for diabetes prediction, which are the most widely used techniques in the literature at the time of writing. It is proposed in this study that a unique intelligent diabetes mellitus prediction framework (IDMPF) is developed using machine learning. The framework was developed after conducting a rigorous review of existing prediction models in the literature and examining their applicability to diabetes. Using the framework, the authors describe the training procedures, model assessment strategies, and issues associated with diabetes prediction, as well as solutions they provide. The findings of this study may be utilized by health professionals, stakeholders, students, and researchers who are involved in diabetes prediction research and development. The proposed work gives 83% accuracy with the minimum error rate.

74 citations


Journal ArticleDOI
TL;DR: In this paper, a review of machine learning techniques applied for stock market prediction is presented, focusing on the stock markets investigated in the literature as well as the types of variables used as input in the machine learning methods used for predicting these markets.
Abstract: In this literature review, we investigate machine learning techniques that are applied for stock market prediction. A focus area in this literature review is the stock markets investigated in the literature as well as the types of variables used as input in the machine learning techniques used for predicting these markets. We examined 138 journal articles published between 2000 and 2019. The main contributions of this review are: (1) an extensive examination of the data, in particular, the markets and stock indices covered in the predictions, as well as the 2173 unique variables used for stock market predictions, including technical indicators, macro-economic variables, and fundamental indicators, and (2) an in-depth review of the machine learning techniques and their variants deployed for the predictions. In addition, we provide a bibliometric analysis of these journal articles, highlighting the most influential works and articles.

72 citations


Book ChapterDOI
01 Jan 2022
TL;DR: The objective of this study is to develop a model with significant accuracy to diagnose diabetes in patients and to enhance the accuracy in diabetes prediction using several machine learning algorithms.
Abstract: Diabetes is one among many chronic diseases. It is the most common disease, and many people are affected by it. There are many things that are responsible for diabetes, mainly age, obesity, weakness, sudden weight loss, and many more. Diabetes patients have a high risk of diseases like cardiopathy, renal disorder, stroke, nerve damage, eye damage, etc. Detection of the disease is not easy, and prediction is also costly. In today’s situation, hospitals are extremely busy due to the COVID-19 pandemic, and it might be revolutionary if one could know if they’re at risk of being diabetic without visiting a doctor. The rise of Artificial Intelligence techniques can be used for disease prognosis. The objective of this study is to develop a model with significant accuracy to diagnose diabetes in patients. Moreover, this paper also presents an effective diabetes prediction model for better classification of diabetes and to enhance the accuracy in diabetes prediction using several machine learning algorithms. Different machine learning algorithms are utilized for early stage diabetes prediction, namely, Logistic Regression, Random Forest Classifier, Support Vector Machine, Decision Trees, K-Nearest Neighbors, Gaussian Process Classifier, AdaBoost Classifier, and Gaussian Naive Bayes. The performances of these models are measured on respective criteria like Accuracy, Precision, Recall, F-Measure, and Error. For this research work, the latest available dataset, dated 22nd July, 2020, is utilized. The latest updated dataset is expected to show comparatively better results.

67 citations


Journal ArticleDOI
01 Dec 2022
TL;DR: Wang et al. as mentioned in this paper developed multiview robust double-sided twin SVM (MvRDTSVM) with SVM-type problems, which introduces a set of double-sided constraints into the proposed model to promote classification performance.
Abstract: Multiview learning (MVL), which enhances the learners’ performance by coordinating complementarity and consistency among different views, has attracted much attention. The multiview generalized eigenvalue proximal support vector machine (MvGSVM) is a recently proposed effective binary classification method, which introduces the concept of MVL into the classical generalized eigenvalue proximal support vector machine (GEPSVM). However, this approach cannot yet guarantee good classification performance and robustness. In this article, we develop multiview robust double-sided twin SVM (MvRDTSVM) with SVM-type problems, which introduces a set of double-sided constraints into the proposed model to promote classification performance. To improve the robustness of MvRDTSVM against outliers, we take L1-norm as the distance metric. Also, a fast version of MvRDTSVM (called MvFRDTSVM) is further presented. The reformulated problems are complex, and solving them is very challenging. As one of the main contributions of this article, we design two effective iterative algorithms to optimize the proposed nonconvex problems and then conduct theoretical analysis on the algorithms. The experimental results verify the effectiveness of our proposed methods.

Journal ArticleDOI
TL;DR: This study combines the new weighted kernel with SKELM and proposes a semi-supervised extreme learning machine algorithm based on the weighted kernel, SELMWK, which has good classification performance and can solve the semi-supervised gas classification task of the same domain data well on the used dataset.
Abstract: At present, machine sense of smell has shown its important role and advantages in many scenarios. The development of machine sense of smell is inseparable from the support of corresponding data and algorithms. However, the process of olfactory data collection is relatively cumbersome, and it is more difficult to collect labeled data. However, in many scenarios, to use a small amount of labeled data to train a good-performing classifier, it is not feasible to rely only on supervised learning algorithms, but semi-supervised learning algorithms can better cope with only a small amount of labeled data and a large amount of unlabeled data. This study combines the new weighted kernel with SKELM and proposes a semi-supervised extreme learning machine algorithm based on the weighted kernel, SELMWK. The experimental results show that the proposed SELMWK algorithm has good classification performance and can solve the semi-supervised gas classification task of the same domain data well on the used dataset.

Proceedings ArticleDOI
01 Jun 2022
TL;DR: RepLKNet as discussed by the authors proposes to use a few large convolutional kernels instead of a stack of small kernels to close the performance gap between CNNs and ViTs, achieving results comparable or superior to those of Swin Transformer on ImageNet.
Abstract: We revisit large kernel design in modern convolutional neural networks (CNNs). Inspired by recent advances in vision transformers (ViTs), in this paper, we demonstrate that using a few large convolutional kernels instead of a stack of small kernels could be a more powerful paradigm. We suggested five guidelines, e.g., applying re-parameterized large depthwise convolutions, to design efficient high-performance large-kernel CNNs. Following the guidelines, we propose RepLKNet, a pure CNN architecture whose kernel size is as large as 31×31, in contrast to commonly used 3×3. RepLKNet greatly closes the performance gap between CNNs and ViTs, e.g., achieving comparable or superior results than Swin Transformer on ImageNet and a few typical downstream tasks, with lower latency. RepLKNet also shows nice scalability to big data and large models, obtaining 87.8% top-1 accuracy on ImageNet and 56.0% mIoU on ADE20K, which is very competitive among the state-of-the-arts with similar model sizes. Our study further reveals that, in contrast to small-kernel CNNs, large-kernel CNNs have much larger effective receptive fields and higher shape bias rather than texture bias. Code & models at https://github.com/megvii-research/RepLKNet.
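The contrast between one large kernel and a stack of small kernels can be made concrete with a short receptive-field calculation. The sketch assumes stride-1, undilated convolutions with a single odd kernel size, and it computes only the theoretical receptive field; the abstract's claim concerns the effective receptive field, which this arithmetic does not measure. The function names are hypothetical.

```python
def stacked_receptive_field(kernel_size, num_layers):
    """Theoretical receptive field of num_layers stride-1, undilated convolutions."""
    return 1 + num_layers * (kernel_size - 1)


def layers_needed(target_field, kernel_size=3):
    """Smallest number of stacked layers whose receptive field reaches target_field."""
    # Ceiling division of (target_field - 1) by (kernel_size - 1).
    return -(-(target_field - 1) // (kernel_size - 1))


single_large = stacked_receptive_field(31, 1)   # one 31x31 layer
small_stack = layers_needed(31, kernel_size=3)  # 3x3 layers needed to match it
```

Under these assumptions, matching the span of a single 31×31 kernel takes fifteen stacked 3×3 layers.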

Journal ArticleDOI
TL;DR: In this paper, the authors proposed a novel scheme, namely convolutional neural network (CNN)-artificial bee colony (ABC) leveraged from CNN and ABC algorithm for HGR and voltage estimation.

Journal ArticleDOI
TL;DR: An Alzheimer's disease detection framework consisting of image denoising of an MRI input data set using an adaptive mean filter, preprocessing using histogram equalization, and feature extraction by Haar wavelet transform is presented.
Abstract: Alzheimer's disease is characterized by the presence of abnormal protein bundles in the brain tissue, but experts are not yet sure what is causing the condition. To find a cure or aversion, researchers need to know more than just that there are protein differences from the usual; they also need to know how these brain nerves form so that a remedy may be discovered. Machine learning is the study of computational approaches for enhancing performance on a specific task through the process of learning. This article presents an Alzheimer's disease detection framework consisting of image denoising of an MRI input data set using an adaptive mean filter, preprocessing using histogram equalization, and feature extraction by Haar wavelet transform. Classification is performed using LS-SVM-RBF, SVM, KNN, and random forest classifier. An adaptive mean filter removes noise from the existing MRI images. Image quality is enhanced by histogram equalization. Experimental results are compared using parameters such as accuracy, sensitivity, specificity, precision, and recall.
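The Haar wavelet feature extraction mentioned above can be illustrated with one level of the 1-D orthonormal Haar transform. This is a sketch only: the paper works on 2-D MRI images, where the same step would be applied along rows and then columns, and the function name `haar_step` is hypothetical.

```python
import math


def haar_step(signal):
    """One level of the orthonormal 1-D Haar transform.

    Returns (approximation, detail) coefficients; the input length must be even.
    """
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    scale = 1 / math.sqrt(2)
    approx = [(signal[i] + signal[i + 1]) * scale for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * scale for i in range(0, len(signal), 2)]
    return approx, detail


# Made-up row of pixel intensities.
approx, detail = haar_step([4, 4, 2, 0])
```

The approximation coefficients summarize local averages, while the detail coefficients capture local differences (edges); a flat pair such as `4, 4` yields a zero detail coefficient.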

Journal ArticleDOI
TL;DR: The proposed SVM-based multifactorial genetic inheritance disorder prediction (MGIDP) of dementia, cancer, and diabetes gives attractive results compared with the proposed KNN model, and its application plays a vital role in minimizing the death ratio around the world.
Abstract: Fatal diseases like cancer, dementia, and diabetes are very dangerous. This leads to fear of death if these are not diagnosed at early stages. Computer science uses biomedical studies to diagnose cancer, dementia, and diabetes. With the advancement of machine learning, there are various techniques which are accessible to predict and prognosis these diseases based on different datasets. These datasets varied (image datasets and CSV datasets) around the world. So, there is a need for some machine learning classifiers to predict cancer, dementia, and diabetes in a human. In this paper, we used a multifactorial genetic inheritance disorder dataset to predict cancer, dementia, and diabetes. Several studies used different machine learning classifiers to predict cancer, dementia, and diabetes separately with the help of different types of datasets. So, in this paper, multiclass classification proposed methodology used support vector machine (SVM) and K-nearest neighbor (KNN) machine learning techniques to predict three diseases and compared these techniques based on accuracy. Simulation results have shown that the proposed model of SVM and KNN for prediction of dementia, cancer, and diabetes from multifactorial genetic inheritance disorder achieved 92.8% and 92.5%, 92.8% and 91.2% accuracy during training and testing, respectively. So, it is observed that proposed SVM-based dementia, cancer, and diabetes from multifactorial genetic inheritance disorder prediction (MGIDP) give attractive results as compared with the proposed model of KNN. The application of the proposed model helps to prognosis and prediction of cancer, dementia, and diabetes before time and plays a vital role to minimize the death ratio around the world.

Journal ArticleDOI
TL;DR: Zhang et al. as discussed by the authors compared the performance of five popular machine learning methods, namely, particle swarm optimization-extreme learning machine (PSO-ELM), PSO-KELM, PSO, SVM and LSTM, in the prediction of reservoir landslide displacement.

Journal ArticleDOI
21 Jan 2022-Sensors
TL;DR: A new method for multiclass skin lesion classification using best deep learning feature fusion and an extreme learning machine is proposed; it improves accuracy over state-of-the-art techniques and is computationally efficient.
Abstract: The variation in skin textures and injuries, as well as the detection and classification of skin cancer, is a difficult task. Manually detecting skin lesions from dermoscopy images is a difficult and time-consuming process. Recent advancements in the domains of the internet of things (IoT) and artificial intelligence for medical applications demonstrated improvements in both accuracy and computational time. In this paper, a new method for multiclass skin lesion classification using best deep learning feature fusion and an extreme learning machine is proposed. The proposed method includes five primary steps: image acquisition and contrast enhancement; deep learning feature extraction using transfer learning; best feature selection using hybrid whale optimization and entropy-mutual information (EMI) approach; fusion of selected features using a modified canonical correlation based approach; and, finally, extreme learning machine based classification. The feature selection step improves the system’s computational efficiency and accuracy. The experiment is carried out on two publicly available datasets, HAM10000 and ISIC2018. The achieved accuracy on both datasets is 93.40 and 94.36 percent. When compared to state-of-the-art (SOTA) techniques, the proposed method’s accuracy is improved. Furthermore, the proposed method is computationally efficient.

Journal ArticleDOI
TL;DR: The proposed Deep neural model outperformed the other four classifiers (Support Vector Machine (SVM), K-Nearest Neighbor (KNN), Logistic regression, Random Forest, and Naive Bayes classifier) by achieving 100% accuracy.
Abstract: Diabetes and high blood pressure are the primary causes of Chronic Kidney Disease (CKD). Glomerular Filtration Rate (GFR) and kidney damage markers are used by researchers around the world to identify CKD as a condition that leads to reduced renal function over time. A person with CKD has a higher chance of dying young. Doctors face a difficult task in diagnosing the different diseases linked to CKD at an early stage in order to prevent the disease. This research presents a novel deep learning model for the early detection and prediction of CKD. This research aims to create a deep neural network and compare its performance to that of other contemporary machine learning techniques. In tests, the average of the associated features was used to replace all missing values in the database. After that, the neural network’s optimum parameters were fixed by establishing the parameters and running multiple trials. The most important features were selected by Recursive Feature Elimination (RFE). Hemoglobin, Specific Gravity, Serum Creatinine, Red Blood Cell Count, Albumin, Packed Cell Volume, and Hypertension were found as key features in the RFE. Selected features were passed to machine learning models for classification purposes. The proposed Deep neural model outperformed the other four classifiers (Support Vector Machine (SVM), K-Nearest Neighbor (KNN), Logistic regression, Random Forest, and Naive Bayes classifier) by achieving 100% accuracy. The proposed approach could be a useful tool for nephrologists in detecting CKD.
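The missing-value step described in this abstract, replacing each missing entry with the average of its feature, can be sketched as below. The rows are invented, `None` is assumed to mark a missing value, and the function name `impute_mean` is hypothetical.

```python
def impute_mean(rows):
    """Replace None entries with the mean of the observed values in the same column.

    Assumes every column has at least one observed value.
    """
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [row[j] for row in rows if row[j] is not None]
        means.append(sum(observed) / len(observed))
    return [[means[j] if value is None else value for j, value in enumerate(row)]
            for row in rows]


# Made-up records with two numeric features and some gaps.
records = [[1.0, None], [3.0, 4.0], [None, 8.0]]
filled = impute_mean(records)
```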

Journal ArticleDOI
01 Aug 2022
TL;DR: In this paper, a hybrid model is proposed that combines a CNN and a support vector machine (SVM) for classification with threshold-based segmentation for detection of brain tumors in MRI images.
Abstract: In this research paper, the brain MRI images are going to classify by considering the excellence of CNN on a public dataset to classify Benign and Malignant tumors. Deep learning (DL) methods due to good performance in the last few years have become more popular for Image classification. Convolution Neural Network (CNN), with several methods, can extract features without using handcrafted models, and eventually, show better accuracy of classification. The proposed hybrid model combined CNN and support vector machine (SVM) in terms of classification and with threshold-based segmentation in terms of detection. The findings of previous studies are based on different models with their accuracy as Rough Extreme Learning Machine (RELM)-94.233%, Deep CNN (DCNN)-95%, Deep Neural Network (DNN) and Discrete Wavelet Autoencoder (DWA)-96%, k-nearest neighbors (kNN)-96.6%, CNN-97.5%. The overall accuracy of the hybrid CNN-SVM is obtained as 98.4959%. In today's world, brain cancer is one of the most dangerous diseases with the highest death rate, detection and classification of brain tumors due to abnormal growth of cells, shapes, orientation, and the location is a challengeable task in medical imaging. Magnetic resonance imaging (MRI) is a typical method of medical imaging for brain tumor analysis. Conventional machine learning (ML) techniques categorize brain cancer based on some handicraft property with the radiologist specialist choice. That can lead to failure in the execution and also decrease the effectiveness of an Algorithm. With a brief look came to know that the proposed hybrid model provides more effective and improvement techniques for classification.

Journal ArticleDOI
TL;DR: In this article, the authors proposed a method to detect leaf diseases in tomato plants using support vector machine (SVM), convolutional neural network (CNN), and K-Nearest Neighbor (K-NN).
Abstract: Agriculture provides food to all the human beings even in case of rapid increase in the population. It is recommended to predict the plant diseases at their early stage in the field of agriculture is essential to cater the food to the overall population. But it unfortunate to predict the diseases at the early stage of the crops. The idea behind the paper is to bring awareness amongst the farmers about the cutting-edge technologies to reduces diseases in plant leaf. Since tomato is merely available vegetable, the approaches of machine learning and image processing with an accurate algorithm is identified to detect the leaf diseases in the tomato plant. In this investigation, the samples of tomato leaves having disorders are considered. With these disorder samples of tomato leaves, the farmers will easily find the diseases based on the early symptoms. Firstly, the samples of tomato leaves are resized to 256 × 256 pixels and then Histogram Equalization is used to improve the quality of tomato samples. The K-means clustering is introduced for partitioning of dataspace into Voronoi cells. The boundary of leaf samples is extracted using contour tracing. The multiple descriptors viz., Discrete Wavelet Transform, Principal Component Analysis and Grey Level Co-occurrence Matrix are used to extract the informative features of the leaf samples. Finally, the extracted features are classified using machine learning approaches such as Support Vector Machine (SVM), Convolutional Neural Network (CNN) and K-Nearest Neighbor (K-NN). The accuracy of the proposed model is tested using SVM (88%), K-NN (97%) and CNN (99.6%) on tomato disordered samples.
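The histogram equalization step used to improve the leaf images can be sketched for a flat list of grayscale pixel values. This follows the standard cumulative-distribution mapping and is an assumption about the variant used; the paper's exact implementation is not given. The function name `equalize` and the sample pixels are hypothetical.

```python
def equalize(pixels, levels=256):
    """Histogram-equalize integer gray levels in the range 0..levels-1."""
    if not pixels:
        return []
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    # Cumulative distribution of gray levels.
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)
    cdf_min = next(c for c in cdf if c > 0)
    n = len(pixels)
    if n == cdf_min:  # constant image: nothing to stretch
        return list(pixels)
    return [round((cdf[p] - cdf_min) / (n - cdf_min) * (levels - 1)) for p in pixels]


# Made-up low-contrast pixel values.
stretched = equalize([10, 10, 20, 30, 30, 30])
```

The mapping spreads the occupied gray levels across the full range, so the narrow band 10 to 30 is stretched to 0 to 255.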

Journal ArticleDOI
TL;DR: In this article, a convolutional neural network (CNN) was used to segment brain tumours from 2D Magnetic Resonance brain Images (MRI) followed by traditional classifiers and deep learning methods.

Journal ArticleDOI
TL;DR: In this paper, the authors presented the utilization of several machine learning techniques such as Artificial Neural Network (ANN), Gradient Boosting (GB), Deep Neural Networks (DNN), Random Forest (RF), Stacking, K Nearest Neighbour (KNN), Support Vector Machine (SVM), Decision tree (DT) and Linear Regression (LR) for predicting annual building energy consumption using a large dataset of residential buildings.
Abstract: The high proportion of energy consumed in buildings has engendered the manifestation of many environmental problems which deploy adverse impacts on the existence of mankind. The prediction of building energy use is essentially proclaimed to be a method for energy conservation and improved decision-making towards decreasing energy usage. Also, the construction of energy efficient buildings will aid the reduction of total energy consumed in newly constructed buildings. Machine Learning (ML) method is recognised as the best suited approach for producing desired outcomes in prediction task. Hence, in several studies, ML has been applied in the field of energy consumption of operational building. However, there are not many studies investigating the suitability of ML methods for forecasting the potential building energy consumption at the early design phase to reduce the construction of more energy inefficient buildings. To address this gap, this paper presents the utilization of several machine learning techniques namely Artificial Neural Network (ANN), Gradient Boosting (GB), Deep Neural Network (DNN), Random Forest (RF), Stacking, K Nearest Neighbour (KNN), Support Vector Machine (SVM), Decision tree (DT) and Linear Regression (LR) for predicting annual building energy consumption using a large dataset of residential buildings. This study also examines the effect of the building clusters on the model performance. The novelty of this paper is to develop a model that enables designers input key features of a building design and forecast the annual average energy consumption at the early stages of development. This result reveals DNN as the most efficient predictive model for energy use at the early design phase and this presents a motivation for building designers to utilize it before construction to make informed decision, manage and optimize design.

Journal ArticleDOI
TL;DR: The main purpose of this proposed work is to develop a system that can determine whether a tweet is “spam” or “ham” and evaluate the emotion of the tweet and create a learning model that will associate tweets with a particular sentiment.
Abstract: In this modern world, we are accustomed to a constant stream of data. Major social media sites like Twitter, Facebook, or Quora face a huge dilemma as a lot of these sites fall victim to spam accounts. These accounts are made to trap unsuspecting genuine users by making them click on malicious links or keep posting redundant posts by using bots. This can greatly impact the experiences that users have on these sites. A lot of time and research has gone into effective ways to detect these forms of spam. Performing sentiment analysis on these posts can help us in solving this problem effectively. The main purpose of this proposed work is to develop a system that can determine whether a tweet is “spam” or “ham” and evaluate the emotion of the tweet. The extracted features after preprocessing the tweets are classified using various classifiers, namely, decision tree, logistic regression, multinomial naïve Bayes, support vector machine, random forest, and Bernoulli naïve Bayes for spam detection. The stochastic gradient descent, support vector machine, logistic regression, random forest, naïve Bayes, and deep learning methods, namely, simple recurrent neural network (RNN) model, long short-term memory (LSTM) model, bidirectional long short-term memory (BiLSTM) model, and 1D convolutional neural network (CNN) model are used for sentiment analysis. The performance of each classifier is analyzed. The classification results showed that the features extracted from the tweets can be satisfactorily used to identify if a certain tweet is spam or not and create a learning model that will associate tweets with a particular sentiment.
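
The spam/ham step described above can be illustrated with one of the listed classifiers, multinomial naïve Bayes. The following is a minimal sketch, not the authors' implementation: it assumes whitespace tokenization and Laplace smoothing, and the toy tweets, the function names (`tokenize`, `train_multinomial_nb`, `predict`) and the `alpha` parameter are all illustrative assumptions that do not come from the paper.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: lowercase + whitespace split stands in for real tweet preprocessing.
    return text.lower().split()


def train_multinomial_nb(samples, alpha=1.0):
    """Count documents per class and word occurrences per class.

    samples: list of (text, label) pairs. alpha is the Laplace smoothing constant.
    """
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in samples:
        class_counts[label] += 1
        for tok in tokenize(text):
            word_counts[label][tok] += 1
            vocab.add(tok)
    return {
        "class_counts": class_counts,
        "word_counts": word_counts,
        "vocab": vocab,
        "alpha": alpha,
    }


def predict(model, text):
    """Return the label with the highest log posterior for the given text."""
    total_docs = sum(model["class_counts"].values())
    vocab_size = len(model["vocab"])
    alpha = model["alpha"]
    best_label, best_score = None, -math.inf
    for label, n_docs in model["class_counts"].items():
        # Log prior of the class.
        score = math.log(n_docs / total_docs)
        total_words = sum(model["word_counts"][label].values())
        for tok in tokenize(text):
            if tok not in model["vocab"]:
                continue  # words never seen in training are ignored
            count = model["word_counts"][label][tok]
            # Smoothed log likelihood of the word given the class.
            score += math.log((count + alpha) / (total_words + alpha * vocab_size))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data, invented for illustration only.
TOY_TWEETS = [
    ("click this link to win a free prize", "spam"),
    ("free prize click now", "spam"),
    ("meeting moved to friday afternoon", "ham"),
    ("lunch with the team on friday", "ham"),
]

model = train_multinomial_nb(TOY_TWEETS)
print(predict(model, "win a free prize now"))
print(predict(model, "team meeting on friday"))
```

In practice a library implementation with proper feature extraction and a held-out test set would be used; the sketch only shows how class priors and smoothed word likelihoods combine into a decision.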

Journal ArticleDOI
TL;DR: In this paper, a classification method for computed tomography chest images in the COVID-19 Radiography Database, using features extracted by popular Convolutional Neural Network (CNN) models, is presented; the determination of hyperparameters of Machine Learning (ML) algorithms by Bayesian optimization and ANN-based image segmentation are the two main contributions.

Journal ArticleDOI
TL;DR: In this article, a new model based on machine learning algorithms was proposed to predict the final exam grades of undergraduate students, taking their midterm exam grades as the source data; the results showed that the proposed model achieved a classification accuracy of 70–75%.
Abstract: Educational data mining has become an effective tool for exploring the hidden relationships in educational data and predicting students' academic achievements. This study proposes a new model based on machine learning algorithms to predict the final exam grades of undergraduate students, taking their midterm exam grades as the source data. The performances of the random forests, nearest neighbour, support vector machines, logistic regression, Naïve Bayes, and k-nearest neighbour algorithms were calculated and compared in predicting the final exam grades of the students. The dataset consisted of the academic achievement grades of 1854 students who took the Turkish Language-I course at a state university in Turkey during the fall semester of 2019–2020. The results show that the proposed model achieved a classification accuracy of 70–75%. The predictions were made using only three types of parameters: midterm exam grades, department data and faculty data. Such data-driven studies are very important for establishing a learning analysis framework in higher education and contributing to decision-making processes. Finally, this study contributes to the early prediction of students at high risk of failure and determines the most effective machine learning methods.

Journal ArticleDOI
TL;DR: An evolutionary approach for classifying and detecting breast cancer that is based on machine learning and image processing that is advantageous for accurately identifying breast cancer disease using image analysis is discussed.
Abstract: Breast cancer is the most lethal type of cancer for all women worldwide. At the moment, there are no effective techniques for preventing or curing breast cancer, as the source of the disease is unclear. Early diagnosis is a highly successful means of detecting and managing breast cancer, and early identification may result in a greater likelihood of complete recovery. Mammography is the most effective method of detecting breast cancer early. Additionally, this instrument enables the detection of additional illnesses and may provide information about the nature of the finding, such as benign, malignant, or normal. This article discusses an evolutionary approach for classifying and detecting breast cancer that is based on machine learning and image processing. The model combines image preprocessing, feature extraction, feature selection, and machine learning techniques to aid in the classification and identification of breast cancer. To enhance the image's quality, a geometric mean filter is used. AlexNet is used for extracting features. Feature selection is performed using the relief algorithm. For disease categorization and detection, the model makes use of machine learning techniques such as least square support vector machine, KNN, random forest, and Naïve Bayes. The experimental investigation makes use of the MIAS data collection. The proposed technique is advantageous for accurately identifying breast cancer using image analysis.

Journal ArticleDOI
TL;DR: A systematic literature review that categorizes, maps and surveys the existing literature on AI methods used to detect cybersecurity attacks in the IoT environment, and that provides an insight into the AI roadmap to detect threats based on attack categories, is presented.
Abstract: In recent years, technology has advanced to the fourth industrial revolution (Industry 4.0), where the Internet of Things (IoT), fog computing, computer security, and cyberattacks have evolved exponentially on a large scale. The rapid development of IoT devices and networks in various forms generates enormous amounts of data, which in turn demand careful authentication and security. Artificial intelligence (AI) is considered one of the most promising methods for addressing cybersecurity threats and providing security. In this study, we present a systematic literature review (SLR) that categorizes, maps and surveys the existing literature on AI methods used to detect cybersecurity attacks in the IoT environment. The scope of this SLR includes an in-depth investigation of the most prominent AI techniques in cybersecurity and state-of-the-art solutions. A systematic search was performed on various electronic databases (SCOPUS, Science Direct, IEEE Xplore, Web of Science, ACM, and MDPI). Out of the identified records, 80 studies published between 2016 and 2021 were selected, surveyed and carefully assessed. This review has explored deep learning (DL) and machine learning (ML) techniques used in IoT security, and their effectiveness in detecting attacks. Several studies have also proposed smart intrusion detection systems (IDS) with intelligent architectural frameworks using AI to overcome the existing security and privacy challenges. It is found that support vector machines (SVM) and random forest (RF) are among the most used methods, owing to their high detection accuracy; another reason may be their efficient memory use. In addition, other methods such as extreme gradient boosting (XGBoost), neural networks (NN) and recurrent neural networks (RNN) also provide good performance. This analysis also provides an insight into the AI roadmap to detect threats based on attack categories. Finally, we present recommendations for potential future investigations.

Journal ArticleDOI
TL;DR: Wang et al. constructed a new intelligent diagnostic rule that is accurate, fast, noninvasive, and cost-effective, and that distinguishes between complicated and uncomplicated appendicitis.