
Showing papers on "Linear discriminant analysis" published in 2021


Book
30 Sep 2021
TL;DR: This book covers exploratory data analysis (EDA), including linear and nonlinear dimensionality reduction, clustering methods such as hierarchical clustering and k-means, and graphical methods for EDA.
Abstract:
INTRODUCTION TO EXPLORATORY DATA ANALYSIS
Introduction to Exploratory Data Analysis: What Is Exploratory Data Analysis; Overview of the Text; A Few Words about Notation; Data Sets Used in the Book; Transforming Data
EDA AS PATTERN DISCOVERY
Dimensionality Reduction - Linear Methods: Introduction; Principal Component Analysis (PCA); Singular Value Decomposition (SVD); Nonnegative Matrix Factorization; Factor Analysis; Fisher's Linear Discriminant; Intrinsic Dimensionality
Dimensionality Reduction - Nonlinear Methods: Multidimensional Scaling (MDS); Manifold Learning; Artificial Neural Network Approaches
Data Tours: Grand Tour; Interpolation Tours; Projection Pursuit; Projection Pursuit Indexes; Independent Component Analysis
Finding Clusters: Introduction; Hierarchical Methods; Optimization Methods - k-Means; Spectral Clustering; Document Clustering; Evaluating the Clusters
Model-Based Clustering: Overview of Model-Based Clustering; Finite Mixtures; Expectation-Maximization Algorithm; Hierarchical Agglomerative Model-Based Clustering; Model-Based Clustering; MBC for Density Estimation and Discriminant Analysis; Generating Random Variables from a Mixture Model
Smoothing Scatterplots: Introduction; Loess; Robust Loess; Residuals and Diagnostics with Loess; Smoothing Splines; Choosing the Smoothing Parameter; Bivariate Distribution Smooths; Curve Fitting Toolbox
GRAPHICAL METHODS FOR EDA
Visualizing Clusters: Dendrogram; Treemaps; Rectangle Plots; ReClus Plots; Data Image
Distribution Shapes: Histograms; Boxplots; Quantile Plots; Bagplots; Rangefinder Boxplot
Multivariate Visualization: Glyph Plots; Scatterplots; Dynamic Graphics; Coplots; Dot Charts; Plotting Points as Curves; Data Tours Revisited; Biplots
Appendix A: Proximity Measures
Appendix B: Software Resources for EDA
Appendix C: Description of Data Sets
Appendix D: Introduction to MATLAB
Appendix E: MATLAB Functions
References
Index
Summary, Further Reading, and Exercises appear at the end of each chapter.

320 citations


Journal Article
Qian Shi, Hui Zhang
TL;DR: Experimental results and comparisons on an automated vehicle illustrate the effectiveness of the proposed algorithm for steering actuator fault diagnosis and show that the proposed algorithm is superior to existing methods in classification.
Abstract: Safety is one of the key requirements for automated vehicles, and fault diagnosis is an effective technique to enhance vehicle safety. The model-based fault diagnosis method incorporates the fault into the system model and estimates it with an observer. In this article, to avoid the complexity of designing an observer, we investigate steering actuator fault diagnosis for automated vehicles using model-based support vector machine (SVM) classification. The system model is used to generate the residual signal as the training data, and the data-based SVM classification algorithm is employed to diagnose the fault. Because data imbalance degrades the performance of the data-driven method, an undersampling procedure based on linear discriminant analysis and a threshold adjustment using the grey wolf optimizer are proposed to improve the performance of classification and fault diagnosis. Various comparisons are carried out on widely used datasets. The comparison results show that the proposed algorithm is superior to existing methods in classification. Experimental results and comparisons on an automated vehicle illustrate the effectiveness of the proposed algorithm for steering actuator fault diagnosis.

158 citations


Journal Article
01 Jan 2021
TL;DR: CART, along with RS or QT, outperforms all other ML algorithms with 100% accuracy, 100% precision, 99% recall, and 100% F1 score, and the study outcomes demonstrate that the model’s performance varies depending on the data scaling method.
Abstract: Heart disease, one of the main causes of the high mortality rate around the world, requires a sophisticated and expensive diagnosis process. In the recent past, much literature has demonstrated machine learning approaches as an opportunity to diagnose heart disease patients efficiently. However, challenges associated with datasets, such as missing data, inconsistent data, and mixed data (containing inconsistent missing values in both numerical and categorical form), are often obstacles in medical diagnosis. This inconsistency leads to a higher probability of misprediction and misleading results. Data preprocessing steps like feature reduction, data conversion, and data scaling are employed to form a standard dataset; such measures play a crucial role in reducing inaccuracy in the final prediction. This paper aims to evaluate eleven machine learning (ML) algorithms, namely Logistic Regression (LR), Linear Discriminant Analysis (LDA), K-Nearest Neighbors (KNN), Classification and Regression Trees (CART), Naive Bayes (NB), Support Vector Machine (SVM), XGBoost (XGB), Random Forest Classifier (RF), Gradient Boost (GB), AdaBoost (AB), and Extra Tree Classifier (ET), and six different data scaling methods, namely Normalization (NR), Standscale (SS), MinMax (MM), MaxAbs (MA), Robust Scaler (RS), and Quantile Transformer (QT), on a dataset comprising information on patients with heart disease. The results show that CART, along with RS or QT, outperforms all other ML algorithms with 100% accuracy, 100% precision, 99% recall, and 100% F1 score. The study outcomes demonstrate that the model’s performance varies depending on the data scaling method.
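The scaling methods compared in this abstract differ mainly in which statistics they use to shift and rescale a feature. The following is a minimal pure-Python sketch of four of them for a single feature column. The function names are illustrative and are not the paper's code, and the robust scaler assumes a median and interquartile-range definition with linearly interpolated quartiles.

```python
import statistics


def min_max(values):
    """Rescale to [0, 1] using the column minimum and maximum."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]  # assumes hi > lo


def max_abs(values):
    """Divide by the largest absolute value, keeping the sign."""
    scale = max(abs(v) for v in values)
    return [v / scale for v in values]


def standard(values):
    """Subtract the mean and divide by the population standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


def robust(values):
    """Subtract the median and divide by the interquartile range (assumed definition)."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return [(v - median) / (q3 - q1) for v in values]


column = [1, 2, 3, 4, 100]  # hypothetical feature with one outlier
print(min_max(column))
print(robust(column))
```

On the hypothetical column, min-max scaling squeezes the four small values close to zero because the outlier sets the range, whereas the robust scaler keeps them spread out. This is one plausible reason a classifier's results can depend on the scaling method.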

128 citations


Journal Article
TL;DR: In this paper, two sliding window techniques are proposed to enhance the binary classification of motor imagery (MI) brain-computer interface (BCI) signals, namely SW-LCR and SW-Mode.
Abstract: Accurate binary classification of electroencephalography (EEG) signals is a challenging task for the development of motor imagery (MI) brain–computer interface (BCI) systems. In this study, two sliding window techniques are proposed to enhance the binary classification of MI. The first one calculates the longest consecutive repetition (LCR) of the sequence of predictions of all the sliding windows and is named SW-LCR. The second calculates the mode of the sequence of predictions of all the sliding windows and is named SW-Mode. Common spatial pattern (CSP) is used for extracting features, with linear discriminant analysis (LDA) used for classification of each time window. Both SW-LCR and SW-Mode are applied to the publicly available BCI Competition IV-2a data set of healthy individuals and to a stroke patients’ data set. Compared with the existing state of the art, SW-LCR performed better in the case of healthy individuals and SW-Mode performed better on the stroke patients’ data set for left- versus right-hand MI, with lower standard deviation. For both data sets, the classification accuracy (CA) was approximately 80% and kappa ($\kappa$) was 0.6. The results show that the sliding window-based prediction of MI using SW-LCR and SW-Mode is robust against intertrial and intersession inconsistencies in the time of activation within a trial and thus can lead to a reliable performance in a neurorehabilitative BCI setting.
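The two decision rules described in this abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the tie-breaking behavior, and the example prediction sequence are assumptions. The last helper shows how accuracy relates to Cohen's kappa when the two true classes are equally frequent, in which case chance agreement is 0.5.

```python
from collections import Counter
from itertools import groupby


def sw_mode(window_predictions):
    """Return the most frequent label across all sliding windows."""
    # Counter.most_common keeps first-seen order on ties (assumed tie-break).
    return Counter(window_predictions).most_common(1)[0][0]


def sw_lcr(window_predictions):
    """Return the label with the longest run of consecutive identical predictions."""
    best_label, best_length = None, 0
    for label, run in groupby(window_predictions):
        length = sum(1 for _ in run)
        if length > best_length:  # strict '>' keeps the earliest run on ties (assumed)
            best_label, best_length = label, length
    return best_label


def kappa_balanced_binary(accuracy):
    """Cohen's kappa for two equally frequent true classes (chance agreement 0.5)."""
    return (accuracy - 0.5) / (1 - 0.5)


# Hypothetical per-window classifier outputs for one trial (0 = left hand, 1 = right hand).
predictions = [1, 1, 1, 0, 1, 0, 0, 0, 0, 1, 1]
print(sw_mode(predictions), sw_lcr(predictions))
print(kappa_balanced_binary(0.8))
```

In this example the two rules disagree: label 1 occurs most often (six of eleven windows), while label 0 has the longest uninterrupted run (four windows). Under the balanced-class assumption, 80% accuracy corresponds to a kappa of about 0.6, which is consistent with the figures reported in the abstract.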

106 citations


Journal Article
TL;DR: An emergent two dimensional discrete wavelet transform (2D-DWT) based IRT method has been proposed in this article for diagnosing the different bearing faults in IM, namely, inner and outer race defects, and lack of lubrication.
Abstract: The bearing is one of the most crucial parts of an induction motor (IM); as a result, there is a constant call for effective diagnosis of bearing faults for reliable operation. Infrared thermography (IRT) is widely used as a non-destructive and non-contact method to detect bearing defects in a rotary machine. However, its performance is limited by insignificant information and strong noise present in the infrared thermal image. To address this issue, an emergent two-dimensional discrete wavelet transform (2D-DWT) based IRT method is proposed in this article for diagnosing different bearing faults in IM, namely inner and outer race defects and lack of lubrication. The dimensionality of the extracted features was reduced using principal component analysis (PCA), and thereafter the selected features were ranked in order of relevance using the Mahalanobis distance (MD) method to achieve the optimal feature set. Finally, these selected features were passed to the complex decision tree (CDT), linear discriminant analysis (LDA), and support vector machine (SVM) for fault classification and performance evaluation. The classification results reveal that the SVM outperformed CDT and LDA. The proposed strategy can be used for self-adaptive recognition of bearing faults in IM, which helps to avoid unplanned and unwanted system shutdowns due to bearing failure.

104 citations


Journal Article
TL;DR: It has been proved that the method can effectively improve the SAR ATR accuracy when labeled samples are insufficient and the recognition accuracy of the method is significantly higher than other semi-supervised methods.
Abstract: Synthetic aperture radar (SAR) automatic target recognition (ATR) technology is one of the research hotspots in the field of image cognitive learning. Inspired by the human cognitive process, experts have designed convolutional neural network (CNN)-based SAR ATR methods. However, the performance of CNN deteriorates significantly when the labeled samples are insufficient. To effectively utilize the unlabeled samples, we present a novel semi-supervised CNN method. In the training process of our method, the information contained in the unlabeled samples is integrated into the loss function of the CNN. Specifically, we first utilize the CNN to obtain the class probabilities of the unlabeled samples. Thresholding is performed to optimize the class probabilities so that the reliability of the unlabeled samples is improved. Afterward, the optimized class probabilities are used to calculate the scatter matrices of the linear discriminant analysis (LDA) method. Finally, the loss function of the CNN is modified by the scatter matrices. We choose ten types of targets from the Moving and Stationary Target Acquisition and Recognition (MSTAR) dataset. The experimental results show that the recognition accuracy of our method is significantly higher than that of other semi-supervised methods. Our method can effectively improve the SAR ATR accuracy when labeled samples are insufficient.
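One way to compute LDA scatter matrices from class probabilities rather than hard labels is to weight each sample's contribution by its probability of belonging to each class. The sketch below is a generic soft-assignment formulation written under that assumption. The abstract does not give the paper's exact weighting, thresholding rule, or loss term, so none of those are reproduced here, and all names are illustrative.

```python
def soft_scatter_matrices(samples, probabilities):
    """Probability-weighted within-class and between-class scatter matrices.

    samples:       list of feature vectors (lists of floats)
    probabilities: list of per-sample class-probability lists, each summing to 1
    """
    n_features = len(samples[0])
    n_classes = len(probabilities[0])

    def outer(u, v):
        return [[a * b for b in v] for a in u]

    def add_scaled(target, matrix, scale):
        for i in range(n_features):
            for j in range(n_features):
                target[i][j] += scale * matrix[i][j]

    overall_mean = [sum(x[d] for x in samples) / len(samples) for d in range(n_features)]
    s_within = [[0.0] * n_features for _ in range(n_features)]
    s_between = [[0.0] * n_features for _ in range(n_features)]

    for c in range(n_classes):
        # Soft class size and probability-weighted class mean (assumes mass > 0).
        mass = sum(p[c] for p in probabilities)
        mean_c = [
            sum(p[c] * x[d] for x, p in zip(samples, probabilities)) / mass
            for d in range(n_features)
        ]
        for x, p in zip(samples, probabilities):
            diff = [x[d] - mean_c[d] for d in range(n_features)]
            add_scaled(s_within, outer(diff, diff), p[c])
        shift = [mean_c[d] - overall_mean[d] for d in range(n_features)]
        add_scaled(s_between, outer(shift, shift), mass)

    return s_within, s_between


# Hypothetical 2-D features and two-class probabilities for four unlabeled samples.
samples = [[0.0, 0.0], [1.0, 0.0], [4.0, 3.0], [5.0, 4.0]]
probabilities = [[0.9, 0.1], [0.8, 0.2], [0.3, 0.7], [0.1, 0.9]]
s_w, s_b = soft_scatter_matrices(samples, probabilities)
print(s_w)
print(s_b)
```

With probabilities that sum to one per sample, the two matrices add up to the ordinary total scatter matrix, just as in hard-label LDA. That identity is a convenient sanity check for any implementation of this idea.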

82 citations


Journal Article
TL;DR: Employing radiomic machine learning/deep learning algorithms could help radiologists to differentiate the histologic subtypes of NSCLC via PET/CT images.
Abstract: To evaluate the capability of PET/CT images for differentiating the histologic subtypes of non-small cell lung cancer (NSCLC) and to identify the optimal model from radiomics-based machine learning/deep learning algorithms. In this study, 867 patients with adenocarcinoma (ADC) and 552 patients with squamous cell carcinoma (SCC) were retrospectively analysed. A stratified random sample of 283 patients (20%) was used as the testing set (173 ADC and 110 SCC); the remaining data were used as the training set. A total of 688 features were extracted from each outlined tumour region. Ten feature selection techniques, ten machine learning (ML) models and the VGG16 deep learning (DL) algorithm were evaluated to construct an optimal classification model for the differential diagnosis of ADC and SCC. Tenfold cross-validation and a grid search were employed to evaluate and optimize the model hyperparameters on the training dataset. The area under the receiver operating characteristic curve (AUROC), accuracy, precision, sensitivity and specificity were used to evaluate the performance of the models on the test dataset. Fifty top-ranked subset features were selected by each feature selection technique for classification. The linear discriminant analysis (LDA) (AUROC, 0.863; accuracy, 0.794) and support vector machine (SVM) (AUROC, 0.863; accuracy, 0.792) classifiers, both coupled with the l2,1NR feature selection method, achieved optimal performance. The random forest (RF) classifier (AUROC, 0.824; accuracy, 0.775) and l2,1NR feature selection method (AUROC, 0.815; accuracy, 0.764) showed excellent average performance among the classifiers and feature selection methods employed in our study, respectively. Furthermore, the VGG16 DL algorithm (AUROC, 0.903; accuracy, 0.841) outperformed all conventional machine learning methods in combination with radiomics.
Employing radiomic machine learning/deep learning algorithms could help radiologists to differentiate the histologic subtypes of NSCLC via PET/CT images.
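The stratified 20% test split described in this abstract can be illustrated with a short sketch. This is not the study's code: the function name and seed are hypothetical, and rounding each class's test size down is an assumption, chosen because it reproduces the reported counts of 173 ADC and 110 SCC test patients.

```python
import random


def stratified_split(labels, test_fraction, seed=0):
    """Return (train_indices, test_indices), sampling each class separately."""
    rng = random.Random(seed)
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)

    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = int(len(indices) * test_fraction)  # round down (assumed)
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)


# Class sizes from the abstract: 867 ADC and 552 SCC patients.
labels = ["ADC"] * 867 + ["SCC"] * 552
train_idx, test_idx = stratified_split(labels, test_fraction=0.2)
test_counts = {c: sum(labels[i] == c for i in test_idx) for c in ("ADC", "SCC")}
print(test_counts, len(train_idx))
```

Sampling within each class keeps the ADC-to-SCC ratio nearly identical in the training and test sets, which matters when the two classes are not equally frequent.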

72 citations


Journal Article
TL;DR: Empirical wavelet transform (EWT) helped to explore the hidden patterns of MI tasks by decomposing EEG data into different modes and regularization parameter tuning of NCA guaranteed to improve classification performance with significant features for each subject.
Abstract: Background: Analysis and classification of extensive medical data (e.g. electroencephalography (EEG) signals) is a significant challenge in developing an effective brain–computer interface (BCI) system. Therefore, it is necessary to build an automated classification framework to decode different brain signals. Methods: In the present study, a two-step filtering approach is utilized to achieve resilience towards cognitive and external noise. Then, empirical wavelet transform (EWT) and four data reduction techniques, namely principal component analysis (PCA), independent component analysis (ICA), linear discriminant analysis (LDA) and neighborhood component analysis (NCA), are integrated for the first time to explore the dynamic nature and pattern mining of motor imagery (MI) EEG signals. Specifically, EWT helped to explore the hidden patterns of MI tasks by decomposing EEG data into different modes, where every mode was considered a feature vector in this study, and each data reduction technique was applied to all these modes to reduce the dimension of the huge feature matrix. Moreover, automated correlation-based components/coefficients selection criteria and parameter tuning were implemented for PCA, ICA, LDA, and NCA respectively. For comparison purposes, all the experiments were performed on two publicly available datasets (BCI competition III dataset IVa and IVb). The performance of the experiments was verified by decoding three different channel combination strategies along with several neural networks. The regularization parameter tuning of NCA guaranteed improved classification performance with significant features for each subject. Results: The experimental results revealed that NCA provides an average sensitivity, specificity, accuracy, precision, F1 score and kappa-coefficient of 100% for the subject dependent case and 93%, 93%, 92.9%, 93%, 96.4% and 90%, respectively, for the subject independent case.
All the results were obtained with artificial neural networks, cascade-forward neural networks and multilayer perceptron neural networks (MLP) for the subject dependent case, and with MLP for the subject independent case, by utilizing 7 channels out of a total of 118. Such an improvement in results can help users express their MI activities more clearly. For instance, physically impaired persons will be able to manage their wheelchairs quite effectively, and persons in rehabilitation may be able to improve their activities.

71 citations


Journal Article
TL;DR: A new hybrid intelligent system is proposed that hybridizes three algorithms, i.e., linear discriminant analysis for dimensionality reduction, support vector machine for classification and genetic algorithm for SVM optimization, into one black box model, namely LDA–GA–SVM.
Abstract: Hepatocellular carcinoma (HCC) is a common type of liver cancer worldwide. Patients with HCC have low chances of survival. The chances of survival increase if the cancer is diagnosed early. Hence, different machine learning-based methods have been developed by researchers for the accurate detection of HCC. However, high dimensionality (curse of dimensionality) and lower prediction accuracy are the problems in the automated detection of HCC. Dimensionality reduction-based methods have shown state-of-the-art performance on many disease detection problems, which motivates the development of machine learning models based on reduced feature dimensions. This paper proposes a new hybrid intelligent system that hybridizes three algorithms, i.e., linear discriminant analysis (LDA) for dimensionality reduction, support vector machine (SVM) for classification and genetic algorithm (GA) for SVM optimization. Consequently, the three models are hybridized and one black box model, namely LDA–GA–SVM, is constructed. Experimental results on a publicly available HCC dataset show improvement in the HCC prediction accuracy. Apart from performance improvement, the proposed method also shows lower complexity in two respects, i.e., reduced processing time for hyperparameter optimization and reduced training time. The proposed method achieved accuracy of 90.30%, sensitivity of 82.25%, specificity of 96.07% and Matthews Correlation Coefficient (MCC) of 0.804.
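The four metrics reported at the end of this abstract are all functions of the binary confusion matrix. The sketch below computes them from raw counts using the standard definitions. The counts in the example are hypothetical and are not the paper's confusion matrix, which the abstract does not give.

```python
import math


def binary_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity, specificity and MCC from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    sensitivity = tp / (tp + fn)  # true-positive rate
    specificity = tn / (tn + fp)  # true-negative rate
    # Assumes no row or column of the confusion matrix is empty.
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denominator
    return accuracy, sensitivity, specificity, mcc


# Hypothetical counts for illustration only.
acc, sens, spec, mcc = binary_metrics(tp=40, fn=10, tn=45, fp=5)
print(f"accuracy={acc:.3f} sensitivity={sens:.3f} specificity={spec:.3f} MCC={mcc:.3f}")
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, so it stays informative when sensitivity and specificity differ noticeably, as they do in the reported results.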

67 citations


Journal Article
TL;DR: An automated EEG based emotion recognition method with a novel fractal pattern feature extraction approach is presented and has been tested on emotional EEG signals with 14 channels using linear discriminant analysis, k-nearest neighbor, and support vector machine classifiers.
Abstract: Electroencephalogram (EEG) signal analysis is one of the most studied research areas in biomedical signal processing and machine learning. Emotion recognition through machine intelligence plays a critical role in understanding brain activities as well as in developing decision-making systems. In this research, an automated EEG based emotion recognition method with a novel fractal pattern feature extraction approach is presented. The presented fractal pattern is inspired by the Firat University logo and named the fractal Firat pattern (FFP). By using FFP and the Tunable Q-factor Wavelet Transform (TQWT) signal decomposition technique, a multilevel feature generator is presented. In the feature selection phase, an improved iterative selector is utilized. Shallow classifiers have been considered to demonstrate the success of the presented TQWT and FFP based feature generation. This model has been tested on emotional EEG signals with 14 channels using linear discriminant analysis (LDA), k-nearest neighbor (k-NN), and support vector machine (SVM) classifiers. The proposed framework achieved 99.82% with the SVM classifier.

65 citations


Journal Article
TL;DR: In this article, a new approach for extension of univariate iterative filtering (IF) for decomposing a signal into intrinsic mode functions (IMFs) or oscillatory modes is proposed for multivariate multi-component signals.

Journal Article
TL;DR: This work designs an unsupervised nonnegative matrix factorization (NMF)-based method called discriminative multiview subspace matrix factorization (DMSMF) for clustering and designs an effective optimization algorithm with proven convergence to obtain an optimal solution procedure for the complex model.

Journal Article
TL;DR: In this article, the bagging ensemble learning method with a decision tree achieved the best performance in predicting heart disease, one of the leading causes of death worldwide.
Abstract: Heart disease is the deadliest disease and one of the leading causes of death worldwide. Machine learning is playing an essential role in medicine. In this paper, ensemble learning methods are used to enhance the performance of predicting heart disease. Two feature extraction methods, linear discriminant analysis (LDA) and principal component analysis (PCA), are used to select essential features from the dataset. Machine learning algorithms and ensemble learning methods are compared on the selected features. Several measures are used to evaluate the models: accuracy, recall, precision, F-measure, and ROC. The results show that the bagging ensemble learning method with a decision tree achieved the best performance.

Journal Article
TL;DR: This study proposes a systematic framework incorporating (a) six feature selection schemes, (b) construction of feature ensembles, and (c) the implementation of eight general ML classifiers for the classification of solid solution high-entropy alloy phases.

Journal Article
TL;DR: A new approach using a deep convolutional neural network (CNN) as a generic feature extractor for intelligent classification of different corn seed varieties is presented.

Journal Article
TL;DR: An ensemble classification-based methodology for malware detection is proposed, with the best performance achieved by an ensemble of five dense and CNN neural networks, and the ExtraTrees classifier as a meta-learner.
Abstract: The security of information is among the greatest challenges facing organizations and institutions. Cybercrime has risen in frequency and magnitude in recent years, with new ways to steal, change and destroy information or disable information systems appearing every day. Among the types of penetration into the information systems where confidential information is processed is malware. An attacker injects malware into a computer system, after which he has full or partial access to critical information in the information system. This paper proposes an ensemble classification-based methodology for malware detection. The first-stage classification is performed by a stacked ensemble of dense (fully connected) and convolutional neural networks (CNN), while the final stage classification is performed by a meta-learner. For a meta-learner, we explore and compare 14 classifiers. For a baseline comparison, 13 machine learning methods are used: K-Nearest Neighbors, Linear Support Vector Machine (SVM), Radial basis function (RBF) SVM, Random Forest, AdaBoost, Decision Tree, ExtraTrees, Linear Discriminant Analysis, Logistic, Neural Net, Passive Classifier, Ridge Classifier and Stochastic Gradient Descent classifier. We present the results of experiments performed on the Classification of Malware with PE headers (ClaMP) dataset. The best performance is achieved by an ensemble of five dense and CNN neural networks, and the ExtraTrees classifier as a meta-learner.

Journal Article
TL;DR: In this article, a novel fuzzy tree classification approach was introduced for Covid-19 detection using three classes of data sets, namely Covid-19, pneumonia, and normal chest x-ray images.

Journal Article
TL;DR: It is verified that GANSO can effectively improve the classifier performance, while the benchmark method SMOTE is not appropriate to deal with such a small size of the training set.
Abstract: In this work, we propose a new method for oversampling the training set of a classifier, in a scenario of extreme scarcity of training data. It is based on two concepts: Generative Adversarial Networks (GAN) and vector Markov Random Field (vMRF). Thus, the generative block of GAN uses the vMRF model to synthesize surrogates by the Graph Fourier Transform. Then, the discriminative block implements a linear discriminant on features measuring clique similarities between the synthesized and the original instances. Both blocks iterate until the linear discriminant cannot discriminate the synthetic from the original instances. We have assessed the new method, called Generative Adversarial Network Synthesis for Oversampling (GANSO), with both simulated and real data in experiments where the classifier is to be trained with just 3 or 5 instances. The applications consisted of classification of stages of neuropsychological tests using electroencephalographic (EEG) and functional magnetic resonance imaging (fMRI) data and classification of sleep stages using electrocardiographic (ECG) data. We have verified that GANSO can effectively improve the classifier performance, while the benchmark method SMOTE is not appropriate to deal with such a small size of the training set.

Journal Article
TL;DR: The overall results demonstrate that NIR combined with a multi-variable selection method can constitute a potential tool to understand the most important features involved in the evaluation of dianhong black tea quality, helping instrument manufacturers to develop low-cost and handheld NIR sensors.

Journal Article
TL;DR: An attempt to recognize seven emotional states from speech signals, namely sad, angry, disgust, happy, surprise, pleasant, and neutral sentiment, is investigated, which employs a non-linear signal quantifying method based on a randomness measure, known as the entropy feature, for the detection of emotions.
Abstract: Emotion recognition from speech signals is a widely researched topic in the design of Human–Computer Interface (HCI) models, since it provides insights into the mental states of human beings. Often, it is required to identify the emotional condition of humans as cognitive feedback in the HCI. In this paper, an attempt to recognize seven emotional states from speech signals, namely sad, angry, disgust, happy, surprise, pleasant, and neutral sentiment, is investigated. The proposed method employs a non-linear signal quantifying method based on a randomness measure, known as the entropy feature, for the detection of emotions. Initially, the speech signals are decomposed into Intrinsic Mode Functions (IMFs), where the IMF signals are divided into dominant frequency bands such as the high-frequency, mid-frequency, and base-frequency bands. The entropy measures are computed directly from the high-frequency band in the IMF domain. However, for the mid- and base-band frequencies, the IMFs are averaged and their entropy measures are computed. A feature vector is formed from the computed entropy measures incorporating the randomness feature for all the emotional signals. Then, the feature vector is used to train a few state-of-the-art classifiers, such as Linear Discriminant Analysis (LDA), Naive Bayes, K-Nearest Neighbor, Support Vector Machine, Random Forest, and Gradient Boosting Machine. A tenfold cross-validation, performed on the publicly available Toronto Emotional Speech dataset, illustrates that the LDA classifier presents a peak balanced accuracy of 93.3%, F1 score of 87.9%, and an area under the curve value of 0.995 in the recognition of emotions from speech signals of native English speakers.
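Balanced accuracy, the headline metric in this abstract, is the average of the per-class recalls, so each emotion class contributes equally regardless of how many samples it has. A minimal sketch follows, with hypothetical labels that are not taken from the dataset.

```python
def balanced_accuracy(y_true, y_pred):
    """Average the per-class recall so every class counts equally."""
    recalls = []
    for label in sorted(set(y_true)):
        indices = [i for i, t in enumerate(y_true) if t == label]
        correct = sum(1 for i in indices if y_pred[i] == label)
        recalls.append(correct / len(indices))
    return sum(recalls) / len(recalls)


# Hypothetical two-class example with unequal class sizes.
y_true = ["sad", "sad", "sad", "sad", "happy", "happy"]
y_pred = ["sad", "sad", "sad", "happy", "happy", "sad"]
print(balanced_accuracy(y_true, y_pred))
```

In this example the recalls are 3/4 for "sad" and 1/2 for "happy", so the balanced accuracy is 0.625. Plain accuracy would be 4/6 because it weights the larger class more heavily.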

Journal Article
TL;DR: In this paper, the authors proposed a method of gait analysis accounting for both known and unknown covariate conditions, i.e., a convolutional neural network (CNN) based gait recognition and a discriminative features-based classification method for unknown covariates.
Abstract: Gait is a unique non-invasive biometric that can be used to effectively recognize persons, even when they are uncooperative. Computer-aided gait recognition systems usually use image sequences without considering covariates like clothing and the carrying of bags whilst on the move. Similarly, in gait recognition, there may exist unknown covariate conditions that affect the training and testing conditions for a given individual. Consequently, common techniques for gait recognition and measurement require a degree of intervention, leading to the introduction of unknown covariate conditions, and this significantly limits the practical use of present gait recognition and analysis systems. To overcome these key issues, we propose a method of gait analysis accounting for both known and unknown covariate conditions. For this purpose, we propose two methods, i.e., a Convolutional Neural Network (CNN) based gait recognition method and a discriminative features-based classification method for unknown covariate conditions. The first method can handle known covariate conditions efficiently, while the second method focuses on identifying and selecting unique covariate-invariant features from the gallery and probe sequences. The feature set utilized here includes Local Binary Patterns (LBP), Histogram of Oriented Gradients (HOG), and Haralick texture features. Furthermore, we utilize Fisher Linear Discriminant Analysis for dimensionality reduction and for selecting the most discriminant features. Three classifiers, namely Random Forest, Support Vector Machine (SVM), and Multilayer Perceptron, are used for gait recognition under strict unknown covariate conditions. We evaluated our results using the CASIA and OU-ISIR datasets for both clothing and speed variations. As a result, we report an average accuracy of 90.32% for the CASIA dataset with unknown covariates, and the method also performed excellently on the ISIR dataset.
Therefore, our proposed method outperforms existing methods for gait recognition under known and unknown covariate conditions.
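Fisher linear discriminant analysis reduces dimensionality by projecting features onto directions that separate the class means relative to the within-class scatter. The sketch below shows the simplest case, two classes of 2-D points, where the direction is w = S_w^{-1}(m_b - m_a). It is a textbook simplification with hypothetical data and names. The paper's multi-class setting and its LBP, HOG and Haralick features are not reproduced.

```python
def fisher_direction_2d(class_a, class_b):
    """Return w = S_w^{-1} (mean_b - mean_a) for two classes of 2-D points."""

    def mean(points):
        n = len(points)
        return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

    def scatter(points, m):
        sxx = sum((p[0] - m[0]) ** 2 for p in points)
        sxy = sum((p[0] - m[0]) * (p[1] - m[1]) for p in points)
        syy = sum((p[1] - m[1]) ** 2 for p in points)
        return sxx, sxy, syy

    mean_a, mean_b = mean(class_a), mean(class_b)
    # Within-class scatter is the sum of the two per-class scatter matrices.
    sxx, sxy, syy = (a + b for a, b in zip(scatter(class_a, mean_a), scatter(class_b, mean_b)))
    det = sxx * syy - sxy * sxy  # assumed non-zero, i.e. S_w is invertible
    dx, dy = mean_b[0] - mean_a[0], mean_b[1] - mean_a[1]
    # Apply the closed-form inverse of the symmetric 2x2 matrix to the mean difference.
    return ((syy * dx - sxy * dy) / det, (sxx * dy - sxy * dx) / det)


def project(points, w):
    """Reduce each 2-D point to a single discriminant score."""
    return [p[0] * w[0] + p[1] * w[1] for p in points]


# Hypothetical feature pairs for two subjects.
subject_a = [(1, 2), (2, 3), (3, 3)]
subject_b = [(6, 5), (7, 8), (8, 7)]
w = fisher_direction_2d(subject_a, subject_b)
print(project(subject_a, w))
print(project(subject_b, w))
```

After projection, each sample is described by one number instead of two, and on this toy data the two groups of scores do not overlap. With more than two classes the same idea yields up to C - 1 discriminant directions from a generalized eigenvalue problem.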

Journal Article
TL;DR: In this article, a novel data-driven fault diagnosis method by combining deep canonical variate analysis and Fisher discriminant analysis (DCVA-FDA) is proposed for complex industrial processes.
Abstract: In this article, a novel data-driven fault diagnosis method combining deep canonical variate analysis and Fisher discriminant analysis (DCVA-FDA) is proposed for complex industrial processes. Inspired by the recently developed deep canonical correlation analysis, a new nonlinear canonical variate analysis (CVA) called DCVA is first developed by incorporating deep neural networks into CVA. Based on DCVA, a residual generator is designed for the fault diagnosis process. FDA is applied in the feature space spanned by residual vectors. Then, a Bayesian inference classifier is applied in the reduced dimensional space of FDA to label the class of process data. A continuous stirred-tank reactor and an industrial benchmark of the Tennessee Eastman process are used to test the performance of DCVA-FDA fault diagnosis. The experimental results demonstrate that the proposed DCVA-FDA fault diagnosis is able to significantly improve the fault diagnosis performance when compared to other methods also examined in this article.

Journal Article
01 Jul 2021
TL;DR: In this paper, the authors used machine learning algorithms to categorize thyroid disease into three categories (hyperthyroidism, hypothyroidism and normal) using data from Iraqi people.
Abstract: With vast amounts of data and information that are difficult to deal with, especially in the health system, machine learning algorithms and data mining techniques have an important role in dealing with data. In our study, we applied machine learning algorithms to thyroid disease. The goal of this study is to categorize thyroid disease into three categories: hyperthyroidism, hypothyroidism, and normal. We worked on this study using data from Iraqi people, some of whom have an overactive thyroid gland and others who have hypothyroidism. We used all of the following algorithms to classify thyroid disease: support vector machines, random forest, decision tree, naive Bayes, logistic regression, k-nearest neighbors, multi-layer perceptron (MLP), and linear discriminant analysis.

Journal Article
TL;DR: In this article, a semantic HOI recognition system based on multi-vision sensors is proposed, where the de-noised RGB and depth images are segmented into multiple clusters using a Simple Linear Iterative Clustering (SLIC) algorithm.
Abstract: Human-Object Interaction (HOI) recognition, due to its significance in many computer vision-based applications, requires in-depth and meaningful details from image sequences. Incorporating semantics in scene understanding has led to a deep understanding of human-centric actions. Therefore, in this research work, we propose a semantic HOI recognition system based on multi-vision sensors. In the proposed system, the RGB and depth images, de-noised via Bilateral Filtering (BLF), are segmented into multiple clusters using a Simple Linear Iterative Clustering (SLIC) algorithm. The skeleton is then extracted from the segmented RGB and depth images via Euclidean Distance Transform (EDT). Human joints, extracted from the skeleton, provide the annotations for accurate pixel-level labeling. An elliptical human model is then generated via a Gaussian Mixture Model (GMM). A Conditional Random Field (CRF) model is trained to allocate a specific label to each pixel of different human body parts and an interaction object. Two semantic feature types are extracted from each labeled body part of the human and from the labeled objects: fiducial points and 3D point clouds. Feature descriptors are quantized using Fisher’s Linear Discriminant Analysis (FLDA) and classified using K-ary Tree Hashing (KATH). In the experimentation phase, the recognition accuracy achieved is 92.88% with the Sports dataset, 93.5% with the Sun Yat-Sen University (SYSU) 3D HOI dataset, and 94.16% with the Nanyang Technological University (NTU) RGB+D dataset. The proposed system is validated via extensive experimentation and should be applicable to many computer vision-based applications such as healthcare monitoring, security systems, and assisted living.

Journal ArticleDOI
TL;DR: In this paper, a three-step classification approach based on time-dependent spectral features (TDSFs) for the classification of complex power quality disturbances (PQDs) is proposed.
Abstract: Power quality events caused by renewable-energy integration are usually associated with complex disturbances; therefore, identifying their type is the primary task before subsequent pollution control. This study proposes a novel three-step classification approach based on time-dependent spectral features (TDSFs) for the classification of complex power quality disturbances (PQDs). First, the improved complete ensemble empirical mode decomposition with adaptive noise (ICEEMDAN) is adopted to decompose the PQDs into several intrinsic mode functions (IMFs). The related IMFs are selected by correlation coefficient and kurtosis. Then eight eigenvalues of each IMF are extracted, including TDSFs. Finally, the eigenvalue dimension of each IMF is reduced by linear discriminant analysis (LDA), and the adaptive $k$-nearest neighbor classifier with excluding outliers (AdaKNNEO) determines the PQD type. To verify the effectiveness of the proposed approach, a series of simulations and hardware experiments are conducted. The overall results show the robustness and high accuracy of the proposed method; for the complex PQDs in particular, it achieves the highest entirely correct classification rate, 96%, compared to other advanced methods.

Journal ArticleDOI
TL;DR: A novel gesture recognition system in which three channels of sEMG signals can classify nine gestures, with linear discriminant analysis adopted as the classifier, making it feasible to identify more gestures with fewer sensors.
Abstract: Surface electromyogram (sEMG) signals have been used to control multifunctional prosthetic hands. Researchers have usually focused on using several channels of sEMG signals to identify more gestures without limiting the number of sEMG sensors. However, the residual muscles of an amputee are limited. Therefore, the key to a successful recognition system is to classify more gestures with fewer channels of sEMG signals. To achieve this goal, we proposed a novel gesture recognition system in which three channels of sEMG signals can classify nine gestures. In this recognition system, the time domain features, root mean square ratio and autoregressive model, were selected to extract the features of the sEMG signals, as compared with the time–frequency domain features. Furthermore, linear discriminant analysis was adopted as the classifier. Consequently, the average accuracy rate of the presented system was 91.7%. Therefore, the proposed gesture recognition system makes it feasible to identify more gestures with fewer sensors.
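
To illustrate what using linear discriminant analysis as a classifier involves, the sketch below fits a two-class Fisher discriminant on 2-D feature vectors in plain Python. It is a deliberately reduced illustration, not the paper's pipeline: the nine-gesture, three-channel setup is not reproduced, the function names (`fit_fisher_2d`, `predict`) and the toy data are hypothetical, and the midpoint threshold assumes equal class priors.

```python
def mean_vec(rows):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]


def fit_fisher_2d(class0, class1):
    """Fit a two-class Fisher discriminant on 2-D features.

    Returns (w, threshold), where w = Sw^{-1} (m1 - m0) and the threshold
    is the midpoint of the projected class means (equal priors assumed).
    """
    m0, m1 = mean_vec(class0), mean_vec(class1)

    # Pooled within-class scatter matrix Sw (2x2).
    s = [[0.0, 0.0], [0.0, 0.0]]
    for rows, m in ((class0, m0), (class1, m1)):
        for r in rows:
            d = [r[0] - m[0], r[1] - m[1]]
            for i in range(2):
                for j in range(2):
                    s[i][j] += d[i] * d[j]

    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    if abs(det) < 1e-12:
        raise ValueError("within-class scatter matrix is singular")

    # Closed-form inverse of the 2x2 scatter matrix.
    inv = [[s[1][1] / det, -s[0][1] / det],
           [-s[1][0] / det, s[0][0] / det]]

    diff = [m1[0] - m0[0], m1[1] - m0[1]]
    w = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
         inv[1][0] * diff[0] + inv[1][1] * diff[1]]

    proj0 = w[0] * m0[0] + w[1] * m0[1]
    proj1 = w[0] * m1[0] + w[1] * m1[1]
    return w, 0.5 * (proj0 + proj1)


def predict(w, threshold, x):
    """Return 1 if x projects above the threshold, otherwise 0."""
    return 1 if w[0] * x[0] + w[1] * x[1] > threshold else 0


if __name__ == "__main__":
    # Hypothetical toy features, e.g. two amplitude features per window.
    rest = [[1.0, 2.0], [2.0, 1.0], [1.5, 1.5], [2.0, 2.5]]
    grip = [[5.0, 6.0], [6.0, 5.0], [5.5, 5.5], [6.0, 6.5]]
    w, t = fit_fisher_2d(rest, grip)
    print(predict(w, t, [1.2, 1.8]), predict(w, t, [5.8, 6.1]))
```

A multi-class version would estimate one mean per gesture and a shared covariance, then assign each sample to the class with the largest linear discriminant score.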

Journal ArticleDOI
TL;DR: A novel framework of signal intelligent classification is proposed based on deep learning networks in ICRNs that is able to learn hierarchical features accurately and achieve excellent signal classification performance and can effectively overcome the negative impact caused by feature parameters uncertainty.
Abstract: With the proliferation of mobile access services and wireless devices, spectrum resources are increasingly becoming scarce. Industrial wireless sensor networks may have to share frequency bands with other systems and suffer from considerable interference. To address that, industrial cognitive radio networks (ICRNs) were developed for effective spectrum sharing, where signal classification is a fundamental and important technology, especially for industrial wireless devices, which need to identify suspicious transmissions. In this article, a novel framework of signal intelligent classification is proposed based on deep learning networks in ICRNs. In the proposed framework, wireless signals will be preprocessed first by Choi–Williams distribution time–frequency analysis and represented by two-dimensional time–frequency images. Then, features of wireless signals are extracted through stack hybrid autoencoders (SHAEs). To accommodate general cases, we design multiple signal classification methods, including unsupervised, semisupervised, and supervised methods, which are processed by Softmax function, semisupervised linear discriminant function, and Fisher discriminant function, respectively. Finally, simulation studies are conducted and the corresponding simulation results show that the proposed framework is able to learn hierarchical features accurately and achieve excellent signal classification performance. Moreover, it can effectively overcome the negative impact caused by feature parameters uncertainty.

Proceedings ArticleDOI
14 Apr 2021
TL;DR: This work proposes various Machine Learning models for the detection of stress on individuals using a publicly available multimodal dataset, WESAD, and finds that the Random Forest model outperformed other models for both binary classification and three-class classification.
Abstract: Mental states like stress, depression, and anxiety have become a huge problem in our modern society. The main objective of this work is to detect stress among people, using Machine Learning approaches with the final aim of improving their quality of life. We propose various Machine Learning models for the detection of stress on individuals using a publicly available multimodal dataset, WESAD. Sensor data including electrocardiogram (ECG), body temperature (TEMP), respiration (RESP), electromyogram (EMG), and electrodermal activity (EDA) are taken for three physiological conditions - neutral (baseline), stress and amusement. The F1-score and accuracy for three-class (amusement vs. baseline vs. stress) and binary (stress vs. non-stress) classifications were computed and compared using machine learning techniques like k-NN, Linear Discriminant Analysis, Random Forest, AdaBoost, and Support Vector Machine. For both binary classification and three-class classification, the Random Forest model outperformed other models with F1-scores of 83.34 and 65.73 respectively.
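
For readers unfamiliar with the metric, the following sketch shows how an F1-score can be computed for binary or multi-class predictions. The abstract does not say which averaging scheme was used, so macro-averaging here is an assumption, and the function names and labels are hypothetical.

```python
def f1_for_label(y_true, y_pred, label):
    """F1-score of one class, treating `label` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    if tp == 0:
        # No true positives: precision or recall is zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1-scores (macro-averaging)."""
    labels = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_label(y_true, y_pred, lab) for lab in labels) / len(labels)


if __name__ == "__main__":
    # Hypothetical ground truth and predictions for a three-class task.
    truth = ["baseline", "stress", "amusement", "stress", "baseline"]
    preds = ["baseline", "stress", "baseline", "stress", "baseline"]
    print(round(macro_f1(truth, preds), 4))
```

Macro-averaging weights every class equally, which matters when the classes are imbalanced; a weighted or micro average would give different numbers on the same predictions.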

Journal ArticleDOI
TL;DR: High classification accuracy suggests that the proposed PGP and TEP based model can be used for heart sound classification using PCG signals, and a novel multilevel feature generation network was developed.

Journal ArticleDOI
TL;DR: This paper investigates the feasibility of using machine learning (ML) based MPPT techniques, to harness maximum power on a PV system under PSC, and demonstrates that WK-NN performs significantly better when compared with other proposed ML based algorithms.
Abstract: The rapid growth of demand for electrical energy and the depletion of fossil fuels opened the door for renewable energy, with solar energy being one of the most popular sources, as it is considered pollution free, freely available, and requires minimal maintenance. This paper investigates the feasibility of using machine learning (ML) based maximum power point tracking (MPPT) techniques to harness maximum power from a PV system under partial shading conditions (PSC). In this study, contributions to the field of PV systems and ML based systems were made by introducing nine (9) ML based MPPT techniques and presenting three (3) experiments under different weather conditions. The performances of Decision Tree (DT), Multivariate Linear Regression (MLR), Gaussian Process Regression (GPR), Weighted K-Nearest Neighbors (WK-NN), Linear Discriminant Analysis (LDA), Bagged Tree (BT), Naive Bayes classifier (NBC), Support Vector Machine (SVM), and Recurrent Neural Network (RNN) are validated using MATLAB SIMULINK simulation software. The experimental results demonstrated that WK-NN performs significantly better than the other proposed ML based algorithms.