
Showing papers on "Support vector machine" published in 2018


Journal ArticleDOI
TL;DR: A new CNN based on LeNet-5 is proposed for fault diagnosis; it extracts features from the converted 2-D images, eliminates the effect of handcrafted features, and achieves significant improvements.
Abstract: Fault diagnosis is vital in manufacturing systems, since early detection of emerging problems can save invaluable time and cost. With the development of smart manufacturing, data-driven fault diagnosis has become a hot topic. However, traditional data-driven fault diagnosis methods rely on features extracted by experts. The feature extraction process is exhausting work and greatly impacts the final result. Deep learning (DL) provides an effective way to extract features from raw data automatically. Convolutional neural network (CNN) is an effective DL method. In this study, a new CNN based on LeNet-5 is proposed for fault diagnosis. Through a conversion method that turns signals into two-dimensional (2-D) images, the proposed method can extract the features of the converted 2-D images and eliminate the effect of handcrafted features. The proposed method, tested on three well-known datasets (the motor bearing dataset, the self-priming centrifugal pump dataset, and the axial piston hydraulic pump dataset), achieves prediction accuracies of 99.79%, 99.481%, and 100%, respectively. The results have been compared with those of other DL and traditional methods, including adaptive deep CNN, sparse filter, deep belief network, and support vector machine. The comparisons show that the proposed CNN-based data-driven fault diagnosis method achieves significant improvements.
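
To make the signal-to-image conversion and LeNet-5-style classifier concrete, here is a minimal PyTorch sketch; the reshape-based conversion, the 64×64 image size, and the layer widths are illustrative assumptions, not the authors' exact pipeline.

```python
import numpy as np
import torch
import torch.nn as nn

def signal_to_image(signal, size=64):
    """Reshape a 1-D vibration segment into a size x size grayscale image in [0, 1]
    (one common conversion scheme; the paper's exact scheme may differ)."""
    seg = signal[: size * size].astype(np.float32)
    seg = (seg - seg.min()) / (seg.max() - seg.min() + 1e-12)
    return seg.reshape(1, size, size)          # channel-first for PyTorch

class LeNetStyleCNN(nn.Module):
    """LeNet-5-like classifier for the converted 2-D images (illustrative sizes)."""
    def __init__(self, n_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 6, 5), nn.ReLU(), nn.MaxPool2d(2),    # 64 -> 60 -> 30
            nn.Conv2d(6, 16, 5), nn.ReLU(), nn.MaxPool2d(2),   # 30 -> 26 -> 13
        )
        self.classifier = nn.Sequential(
            nn.Flatten(), nn.Linear(16 * 13 * 13, 120), nn.ReLU(),
            nn.Linear(120, 84), nn.ReLU(), nn.Linear(84, n_classes),
        )

    def forward(self, x):
        return self.classifier(self.features(x))

x = torch.from_numpy(signal_to_image(np.random.randn(64 * 64))).unsqueeze(0)
logits = LeNetStyleCNN()(x)                     # shape: (1, n_classes)
```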

1,240 citations


Journal ArticleDOI
TL;DR: An overview of machine learning from an applied perspective focuses on the relatively mature methods of support vector machines, single decision trees (DTs), Random Forests, boosted DTs, artificial neural networks, and k-nearest neighbours (k-NN).
Abstract: Machine learning offers the potential for effective and efficient classification of remotely sensed imagery. The strengths of machine learning include the capacity to handle data of high dimensionality and to map classes with very complex characteristics. Nevertheless, implementing a machine-learning classification is not straightforward, and the literature provides conflicting advice regarding many key issues. This article therefore provides an overview of machine learning from an applied perspective. We focus on the relatively mature methods of support vector machines, single decision trees (DTs), Random Forests, boosted DTs, artificial neural networks, and k-nearest neighbours (k-NN). Issues considered include the choice of algorithm, training data requirements, user-defined parameter selection and optimization, feature space impacts and reduction, and computational costs. We illustrate these issues through applying machine-learning classification to two publicly available remotely sensed dat...
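
As a minimal illustration of how these mature classifiers are typically compared in practice, the following scikit-learn sketch cross-validates several of them on a synthetic stand-in for a labelled pixel-feature matrix; the data and parameter choices are placeholders, not the article's experiments.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# synthetic stand-in for training pixels: 20 spectral/ancillary features, 4 land-cover classes
X, y = make_classification(n_samples=1000, n_features=20, n_informative=10,
                           n_classes=4, random_state=0)

classifiers = {
    "SVM (RBF)": SVC(kernel="rbf", C=10, gamma="scale"),
    "Single DT": DecisionTreeClassifier(),
    "Random Forest": RandomForestClassifier(n_estimators=200),
    "Boosted DT": GradientBoostingClassifier(),
    "k-NN": KNeighborsClassifier(n_neighbors=5),
}
for name, clf in classifiers.items():
    scores = cross_val_score(clf, X, y, cv=5)               # 5-fold cross-validated accuracy
    print(f"{name}: {scores.mean():.3f}")
```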

919 citations


Journal ArticleDOI
TL;DR: The recent progress of SVMs in cancer genomic studies is reviewed, conveying the strength of SVM learning and its future prospects in cancer genomic applications.
Abstract: Machine learning with maximization (support) of separating margin (vector), called support vector machine (SVM) learning, is a powerful classification tool that has been used for cancer genomic classification or subtyping. Today, as advancements in high-throughput technologies lead to production of large amounts of genomic and epigenomic data, the classification feature of SVMs is expanding its use in cancer genomics, leading to the discovery of new biomarkers, new drug targets, and a better understanding of cancer driver genes. Herein we reviewed the recent progress of SVMs in cancer genomic studies. We intend to comprehend the strength of the SVM learning and its future perspective in cancer genomic applications.
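
One concrete pattern behind the biomarker-discovery use mentioned above is recursive feature elimination with a linear SVM (SVM-RFE); the sketch below shows the idea on synthetic expression data and is an illustration, not code from the review itself.

```python
import numpy as np
from sklearn.feature_selection import RFE
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 500))            # 100 samples x 500 gene-expression features (synthetic)
y = rng.integers(0, 2, size=100)           # tumour subtype labels (stand-in)

# rank genes by repeatedly dropping the least-weighted features of a linear SVM
selector = RFE(SVC(kernel="linear", C=1.0), n_features_to_select=20, step=50)
selector.fit(X, y)
candidate_biomarkers = np.where(selector.support_)[0]
print("candidate biomarker indices:", candidate_biomarkers)
```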

635 citations


Journal ArticleDOI
TL;DR: An iterative cluster Primal Dual Splitting algorithm solves the large-scale sSVM problem in a decentralized fashion; the important features discovered by the algorithm are predictive of future hospitalizations, providing a way to interpret the classification results and inform prevention efforts.

577 citations


Journal ArticleDOI
TL;DR: In this paper, the authors examine gradient descent on unregularized logistic regression problems, with homogeneous linear predictors on linearly separable datasets, and show that the predictor converges to the direction of the max-margin (hard margin SVM) solution.
Abstract: We examine gradient descent on unregularized logistic regression problems, with homogeneous linear predictors on linearly separable datasets. We show the predictor converges to the direction of the max-margin (hard margin SVM) solution. The result also generalizes to other monotone decreasing loss functions with an infimum at infinity, to multi-class problems, and to training a weight layer in a deep network in a certain restricted setting. Furthermore, we show this convergence is very slow, and only logarithmic in the convergence of the loss itself. This can help explain the benefit of continuing to optimize the logistic or cross-entropy loss even after the training error is zero and the training loss is extremely small, and, as we show, even if the validation loss increases. Our methodology can also aid in understanding implicit regularization in more complex models and with other optimization methods.
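
The main claim can be checked numerically: on separable data, plain gradient descent on the unregularized logistic loss drifts toward the hard-margin SVM direction. The sketch below (synthetic data, a large-C LinearSVC as the hard-margin proxy) is an illustration of the theorem, not the authors' code.

```python
import numpy as np
from scipy.special import expit
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
X = np.vstack([rng.normal([2, 2], 0.3, (50, 2)), rng.normal([-2, -2], 0.3, (50, 2))])
y = np.hstack([np.ones(50), -np.ones(50)])              # linearly separable labels

w = np.zeros(2)
for _ in range(200_000):                                 # many steps: convergence is only logarithmic
    margins = y * (X @ w)
    grad = -(X * (y * expit(-margins))[:, None]).mean(axis=0)   # gradient of mean logistic loss
    w -= 0.1 * grad

svm = LinearSVC(C=1e6, fit_intercept=False, max_iter=100_000).fit(X, y)  # ~hard-margin SVM
w_svm = svm.coef_.ravel()
cos = w @ w_svm / (np.linalg.norm(w) * np.linalg.norm(w_svm))
print("cosine between GD direction and max-margin direction:", cos)      # approaches 1
```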

488 citations


Journal ArticleDOI
TL;DR: Two classification algorithms that use the quantum state space to produce feature maps are demonstrated on a superconducting processor, enabling the solution of problems when the feature space is large and the kernel functions are computationally expensive to estimate.
Abstract: Machine learning and quantum computing are two technologies, each with the potential to alter how computation is performed to address previously untenable problems. Kernel methods for machine learning are ubiquitous in pattern recognition, with support vector machines (SVMs) being the most well-known method for classification problems. However, there are limitations to the successful solution of such problems when the feature space becomes large and the kernel functions become computationally expensive to estimate. A core element of the computational speed-ups afforded by quantum algorithms is the exploitation of an exponentially large quantum state space through controllable entanglement and interference. Here, we propose and experimentally implement two novel methods on a superconducting processor. Both methods represent the feature space of a classification problem by a quantum state, taking advantage of the large dimensionality of quantum Hilbert space to obtain an enhanced solution. One method, the quantum variational classifier, builds on [1,2] and uses a variational quantum circuit to classify a training set in direct analogy to conventional SVMs. In the second, a quantum kernel estimator, we estimate the kernel function and optimize the classifier directly. The two methods present a new class of tools for exploring the applications of noisy intermediate-scale quantum computers [3] to machine learning.
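
Independent of how the kernel entries are obtained, the quantum kernel estimator plugs into an otherwise standard SVM through a precomputed Gram matrix. The sketch below is a purely classical stand-in (an RBF kernel replaces the hardware-estimated quantum kernel) showing only that interface.

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.svm import SVC

rng = np.random.default_rng(1)
X_train, X_test = rng.normal(size=(40, 2)), rng.normal(size=(10, 2))
y_train = (X_train[:, 0] * X_train[:, 1] > 0).astype(int)

# In the paper these Gram matrices are estimated on a superconducting processor;
# here an ordinary RBF kernel stands in for the quantum feature-map kernel.
K_train = rbf_kernel(X_train, X_train)
K_test = rbf_kernel(X_test, X_train)

clf = SVC(kernel="precomputed").fit(K_train, y_train)
predictions = clf.predict(K_test)
```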

463 citations


Journal ArticleDOI
TL;DR: Three machine learning classification algorithms namely Decision Tree, SVM and Naive Bayes are used in this experiment to detect diabetes at an early stage using Pima Indians Diabetes Database which is sourced from UCI machine learning repository.
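
A minimal scikit-learn sketch of the experiment described above, assuming the Pima Indians Diabetes data are available locally as a CSV with the usual eight feature columns and an "Outcome" label column (the file name and column name are hypothetical placeholders):

```python
import pandas as pd
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

df = pd.read_csv("pima-indians-diabetes.csv")        # hypothetical local copy of the UCI data
X, y = df.drop(columns=["Outcome"]), df["Outcome"]

for name, clf in [("Decision Tree", DecisionTreeClassifier()),
                  ("SVM", SVC(kernel="rbf", gamma="scale")),
                  ("Naive Bayes", GaussianNB())]:
    acc = cross_val_score(clf, X, y, cv=10).mean()   # 10-fold cross-validated accuracy
    print(f"{name}: {acc:.3f}")
```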

431 citations


Journal ArticleDOI
TL;DR: Experimental results with widely used hyperspectral image data sets demonstrate that the proposed classification framework, called diverse region-based CNN, can surpass any other conventional deep learning-based classifiers and other state-of-the-art classifiers.
Abstract: Convolutional neural network (CNN) is of great interest in machine learning and has demonstrated excellent performance in hyperspectral image classification. In this paper, we propose a classification framework, called diverse region-based CNN, which can encode semantic context-aware representation to obtain promising features. By merging a diverse set of discriminative appearance factors, the resulting CNN-based representation exhibits spatial-spectral context sensitivity that is essential for accurate pixel classification. By exploiting diverse region-based inputs to learn contextual interaction features, the proposed method is expected to have more discriminative power. The joint representation containing rich spectral and spatial information is then fed to a fully connected network, and the label of each pixel vector is predicted by a softmax layer. Experimental results with widely used hyperspectral image data sets demonstrate that the proposed method can surpass other conventional deep learning-based classifiers and other state-of-the-art classifiers.

423 citations


Journal ArticleDOI
01 Jan 2018
TL;DR: A comparative study on various reported data splitting methods found that the size of the data is the deciding factor for the quality of the generalization performance estimated from the validation set, suggesting that it is necessary to have a good balance between the sizes of the training set and validation set to have a reliable estimation of model performance.
Abstract: Model validation is the most important part of building a supervised model. For building a model with good generalization performance one must have a sensible data splitting strategy, and this is crucial for model validation. In this study, we conducted a comparative study of various reported data splitting methods. The MixSim model was employed to generate nine simulated datasets with different probabilities of mis-classification and variable sample sizes. Then partial least squares for discriminant analysis and support vector machines for classification were applied to these datasets. Data splitting methods tested included variants of cross-validation, bootstrapping, bootstrapped Latin partition, the Kennard-Stone algorithm (K-S) and sample set partitioning based on joint X–Y distances algorithm (SPXY). These methods were employed to split the data into training and validation sets. The estimated generalization performances from the validation sets were then compared with the ones obtained from the blind test sets, which were generated from the same distribution but were unseen by the training/validation procedure used in model construction. The results showed that the size of the data is the deciding factor for the quality of the generalization performance estimated from the validation set. We found that there was a significant gap between the performance estimated from the validation set and the one from the test set for all the data splitting methods employed on small datasets. Such disparity decreased when more samples were available for training/validation, because the models then moved towards approximations of the central limit theorem for the simulated datasets used. We also found that having too many or too few samples in the training set had a negative effect on the estimated model performance, suggesting that it is necessary to have a good balance between the sizes of the training set and validation set to have a reliable estimation of model performance. We also found that systematic sampling methods such as K-S and SPXY generally gave very poor estimates of the model performance, most likely because they are designed to take the most representative samples first and thus leave a rather poorly representative sample set for model performance estimation.
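
The validation-versus-blind-test gap described above is easy to reproduce in miniature: the sketch below (synthetic data, an SVM classifier, and a simple random split rather than the paper's full set of splitting schemes) compares the validation estimate with a large unseen test set for a small and a larger training/validation pool.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X_pool, y_pool = make_classification(n_samples=5000, n_features=20, flip_y=0.1,
                                     random_state=0)
X_test, y_test = X_pool[3000:], y_pool[3000:]        # large "blind" test set from the same distribution

for n in (60, 1000):                                  # small vs. larger training/validation pool
    X_tv, y_tv = X_pool[:n], y_pool[:n]
    X_tr, X_val, y_tr, y_val = train_test_split(X_tv, y_tv, test_size=0.3, random_state=1)
    clf = SVC(gamma="scale").fit(X_tr, y_tr)
    print(f"n={n}: validation accuracy {clf.score(X_val, y_val):.3f}, "
          f"blind-test accuracy {clf.score(X_test, y_test):.3f}")
```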

380 citations


Journal ArticleDOI
TL;DR: Well-known machine learning techniques, namely, SVM, random forest, and extreme learning machine (ELM) are applied and the results indicate that ELM outperforms other approaches in intrusion detection mechanisms.
Abstract: Intrusion detection is a fundamental part of security tools, such as adaptive security appliances, intrusion detection systems, intrusion prevention systems, and firewalls. Various intrusion detection techniques are used, but their performance is an issue. Intrusion detection performance depends on accuracy, which needs to improve to decrease false alarms and to increase the detection rate. To resolve concerns on performance, multilayer perceptron, support vector machine (SVM), and other techniques have been used in recent work. Such techniques have shown limitations and are not efficient for use on large data sets, such as system and network data. The intrusion detection system is used to analyze huge traffic data; thus, an efficient classification technique is necessary to overcome this issue. This problem is considered in this paper. Well-known machine learning techniques, namely, SVM, random forest, and extreme learning machine (ELM), are applied. These techniques are well-known because of their capability in classification. The NSL–knowledge discovery and data mining data set is used, which is considered a benchmark in the evaluation of intrusion detection mechanisms. The results indicate that ELM outperforms the other approaches.
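
A compact sketch of the comparison above on a synthetic stand-in for NSL-KDD feature vectors; since scikit-learn has no ELM, a minimal one (random hidden layer plus a ridge-regularized readout) is written out directly, so the numbers it prints are illustrative only.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=3000, n_features=40, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# minimal ELM: fixed random projection, sigmoid activation, ridge-regularized output weights
rng = np.random.default_rng(0)
W = rng.normal(size=(X.shape[1], 200))
H_tr = 1.0 / (1.0 + np.exp(-X_tr @ W))
beta = np.linalg.solve(H_tr.T @ H_tr + 1e-2 * np.eye(200), H_tr.T @ y_tr)
H_te = 1.0 / (1.0 + np.exp(-X_te @ W))
print("ELM accuracy:", ((H_te @ beta > 0.5) == y_te).mean())

print("SVM accuracy:", SVC(gamma="scale").fit(X_tr, y_tr).score(X_te, y_te))
print("RF accuracy :", RandomForestClassifier(n_estimators=200).fit(X_tr, y_tr).score(X_te, y_te))
```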

379 citations


Journal ArticleDOI
TL;DR: A deep feature fusion network (DFFN) is proposed for HSI classification that fuses the outputs of different hierarchical layers, which can further improve the classification accuracy and outperforms other competitive classifiers.
Abstract: Recently, deep learning has been introduced to classify hyperspectral images (HSIs) and has achieved good performance. In general, deep models adopt a large number of hierarchical layers to extract features. However, excessively increasing network depth will result in some negative effects (e.g., overfitting, gradient vanishing, and accuracy degradation) for conventional convolutional neural networks. In addition, the previous networks used in HSI classification do not consider the strong complementary yet correlated information among different hierarchical layers. To address the above two issues, a deep feature fusion network (DFFN) is proposed for HSI classification. On the one hand, residual learning is introduced to optimize several convolutional layers as identity mappings, which eases the training of deep networks and allows the model to benefit from increasing depth. As a result, we can build a very deep network to extract more discriminative features of HSIs. On the other hand, the proposed DFFN model fuses the outputs of different hierarchical layers, which can further improve the classification accuracy. Experimental results on three real HSIs demonstrate that the proposed method outperforms other competitive classifiers.
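
A schematic PyTorch sketch of the two ingredients above, residual blocks and fusion of features taken from different depths; the channel counts, patch size, and fusion-by-concatenation choice are illustrative assumptions rather than the exact DFFN architecture.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1), nn.BatchNorm2d(channels), nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1), nn.BatchNorm2d(channels),
        )

    def forward(self, x):
        return torch.relu(x + self.body(x))      # identity mapping eases training of deep nets

class FusionNet(nn.Module):
    def __init__(self, in_channels, n_classes):
        super().__init__()
        self.stem = nn.Conv2d(in_channels, 32, 3, padding=1)
        self.stage1, self.stage2, self.stage3 = (ResidualBlock(32) for _ in range(3))
        self.head = nn.Linear(32 * 3, n_classes)

    def forward(self, x):
        f1 = self.stage1(self.stem(x))
        f2 = self.stage2(f1)
        f3 = self.stage3(f2)
        # fuse hierarchical features by global average pooling and concatenation
        fused = torch.cat([f.mean(dim=(2, 3)) for f in (f1, f2, f3)], dim=1)
        return self.head(fused)

logits = FusionNet(in_channels=103, n_classes=9)(torch.randn(2, 103, 9, 9))  # 9x9 HSI patches
```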

Proceedings ArticleDOI
20 May 2018
TL;DR: This work proposes attacks on stealing the hyperparameters that are learned by a learner, applicable to a variety of popular machine learning algorithms such as ridge regression, logistic regression, support vector machine, and neural network.
Abstract: Hyperparameters are critical in machine learning, as different hyperparameters often result in models with significantly different performance. Hyperparameters may be deemed confidential because of their commercial value and the confidentiality of the proprietary algorithms that the learner uses to learn them. In this work, we propose attacks on stealing the hyperparameters that are learned by a learner. We call our attacks hyperparameter stealing attacks. Our attacks are applicable to a variety of popular machine learning algorithms such as ridge regression, logistic regression, support vector machine, and neural network. We evaluate the effectiveness of our attacks both theoretically and empirically. For instance, we evaluate our attacks on Amazon Machine Learning. Our results demonstrate that our attacks can accurately steal hyperparameters. We also study countermeasures. Our results highlight the need for new defenses against our hyperparameter stealing attacks for certain machine learning algorithms.
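
For ridge regression, the flavour of the attack is easy to see: with the training data and the learned weights in hand, the regularization strength follows from the stationarity condition of the training objective. The sketch below illustrates that idea on synthetic data and is a simplified reconstruction, not the paper's code.

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
y = X @ rng.normal(size=10) + 0.1 * rng.normal(size=200)

true_alpha = 3.0
w = Ridge(alpha=true_alpha, fit_intercept=False).fit(X, y).coef_   # "stolen" model parameters

# At the minimum of ||y - Xw||^2 + alpha * ||w||^2 the gradient vanishes,
# so X^T (y - Xw) = alpha * w; a least-squares fit over coordinates recovers alpha.
residual_term = X.T @ (y - X @ w)
estimated_alpha = (w @ residual_term) / (w @ w)
print(true_alpha, estimated_alpha)     # the estimate should be very close to 3.0
```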

Journal ArticleDOI
TL;DR: Experimental results indicate that while the technical constraints linked to automatic plant disease classification have been largely overcome, the use of limited image datasets for training brings many undesirable consequences that still prevent the effective dissemination of this type of technology.

Journal ArticleDOI
TL;DR: Experimental results show that the proposed fault classification algorithm achieves high diagnosis accuracy for different working conditions of rolling bearing and outperforms some traditional methods both mentioned in this paper and published in other literature.

Posted Content
TL;DR: A comprehensive set of experiments demonstrate that on complex data sets (like CIFAR and PFAM), OC-NN significantly outperforms existing state-of-the-art anomaly detection methods.
Abstract: We propose a one-class neural network (OC-NN) model to detect anomalies in complex data sets. OC-NN combines the ability of deep networks to extract a progressively rich representation of data with the one-class objective of creating a tight envelope around normal data. The OC-NN approach breaks new ground for the following crucial reason: data representation in the hidden layer is driven by the OC-NN objective and is thus customized for anomaly detection. This is a departure from other approaches which use a hybrid approach of learning deep features using an autoencoder and then feeding the features into a separate anomaly detection method like one-class SVM (OC-SVM). The hybrid OC-SVM approach is sub-optimal because it is unable to influence representational learning in the hidden layers. A comprehensive set of experiments demonstrate that on complex data sets (like CIFAR and GTSRB), OC-NN performs on par with state-of-the-art methods and outperformed conventional shallow methods in some scenarios.
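
A condensed PyTorch sketch of the one-class objective described above: a small network is trained with a hinge-style loss around a threshold r that is repeatedly reset to the ν-quantile of the scores, so the hidden representation itself is shaped for anomaly detection. Network sizes, the optimizer, and the data are placeholders, and the alternating update is a simplified reading of the paper's scheme.

```python
import torch
import torch.nn as nn

nu = 0.1                                            # fraction of training points allowed below r
net = nn.Sequential(nn.Linear(32, 16), nn.Sigmoid(), nn.Linear(16, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

X = torch.randn(512, 32)                            # stand-in for "normal" training data
r = 0.0
for epoch in range(100):
    scores = net(X).squeeze(1)
    loss = torch.mean(torch.relu(r - scores)) / nu - r \
        + 0.5 * sum((p ** 2).sum() for p in net.parameters())   # weight-decay term of the objective
    opt.zero_grad()
    loss.backward()
    opt.step()
    r = scores.detach().quantile(nu).item()         # alternating update of the threshold r

anomaly_score = r - net(torch.randn(5, 32)).squeeze(1)   # larger value => more anomalous
```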

Journal ArticleDOI
TL;DR: This paper revisits existing security threats and gives a systematic survey of them from two aspects, the training phase and the testing/inferring phase, and categorizes current defensive techniques of machine learning into four groups: security assessment mechanisms, countermeasures in the training phase, countermeasures in the testing or inferring phase, and data security and privacy.
Abstract: Machine learning is one of the most prevailing techniques in computer science, and it has been widely applied in image processing, natural language processing, pattern recognition, cybersecurity, and other fields. Regardless of successful applications of machine learning algorithms in many scenarios, e.g., facial recognition, malware detection, automatic driving, and intrusion detection, these algorithms and the corresponding training data are vulnerable to a variety of security threats, inducing a significant performance decrease. Hence, it is vital to call for further attention regarding security threats and corresponding defensive techniques of machine learning, which motivates the comprehensive survey in this paper. Until now, researchers from academia and industry have identified many security threats against a variety of learning algorithms, including naive Bayes, logistic regression, decision tree, support vector machine (SVM), principal component analysis, clustering, and prevailing deep neural networks. Thus, we revisit existing security threats and give a systematic survey of them from two aspects, the training phase and the testing/inferring phase. After that, we categorize current defensive techniques of machine learning into four groups: security assessment mechanisms, countermeasures in the training phase, countermeasures in the testing or inferring phase, and data security and privacy. Finally, we provide five notable trends in the research on security threats and defensive techniques of machine learning, which warrant in-depth study in the future.

Journal ArticleDOI
01 Mar 2018 - Catena
TL;DR: The first comprehensive comparison among the performances of ten advanced machine learning techniques (MLTs) including artificial neural networks (ANNs), boosted regression tree (BRT), classification and regression trees (CART), generalized linear model (GLM), generalized additive model (GAM), multivariate adaptive regression splines (MARS), naive Bayes (NB), quadratic discriminant analysis (QDA), random forest (RF), and support vector machines (SVM) is presented.
Abstract: Coupling machine learning algorithms with spatial analytical techniques for landslide susceptibility modeling is an issue worth considering. The current research therefore intends to present the first comprehensive comparison among the performances of ten advanced machine learning techniques (MLTs), including artificial neural networks (ANNs), boosted regression tree (BRT), classification and regression trees (CART), generalized linear model (GLM), generalized additive model (GAM), multivariate adaptive regression splines (MARS), naive Bayes (NB), quadratic discriminant analysis (QDA), random forest (RF), and support vector machines (SVM), for modeling landslide susceptibility and evaluating the importance of variables in GIS and the R open source software. This study was carried out in the Ghaemshahr Region, Iran. The performance of the MLTs has been evaluated using the area under the ROC curve (AUC-ROC) approach. The results showed that the AUC values for the ten MLTs vary from 62.4 to 83.7%. It has been found that RF (AUC = 83.7%) and BRT (AUC = 80.7%) have the best performance in comparison to the other MLTs.

Journal ArticleDOI
TL;DR: The proposed STL-IDS approach improves network intrusion detection and provides a new research method for intrusion detection, and has accelerated SVM training and testing times and performed better than most of the previous approaches in terms of performance metrics in binary and multiclass classification.
Abstract: Network intrusion detection systems (NIDSs) provide a better solution to network security than other traditional network defense technologies, such as firewall systems. The success of NIDS is highly dependent on the performance of the algorithms and improvement methods used to increase the classification accuracy and decrease the training and testing times of the algorithms. We propose an effective deep learning approach, self-taught learning (STL)-IDS, based on the STL framework. The proposed approach is used for feature learning and dimensionality reduction. It reduces training and testing time considerably and effectively improves the prediction accuracy of support vector machines (SVM) with regard to attacks. The proposed model is built using the sparse autoencoder mechanism, which is an effective learning algorithm for reconstructing a new feature representation in an unsupervised manner. After the pre-training stage, the new features are fed into the SVM algorithm to improve its detection capability for intrusion and classification accuracy. Moreover, the efficiency of the approach in binary and multiclass classification is studied and compared with that of shallow classification methods, such as J48, naive Bayesian, random forest, and SVM. Results show that our approach has accelerated SVM training and testing times and performed better than most of the previous approaches in terms of performance metrics in binary and multiclass classification. The proposed STL-IDS approach improves network intrusion detection and provides a new research method for intrusion detection.
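
The pipeline above amounts to unsupervised pre-training followed by a supervised SVM on the learned codes. The sketch below uses a plain autoencoder in PyTorch (the paper uses a sparse variant) and random stand-in data with NSL-KDD-like dimensionality.

```python
import torch
import torch.nn as nn
from sklearn.svm import SVC

X_unlabeled = torch.randn(2000, 41)                 # stand-in for 41-dimensional NSL-KDD features
encoder = nn.Sequential(nn.Linear(41, 16), nn.ReLU())
decoder = nn.Sequential(nn.Linear(16, 41))
opt = torch.optim.Adam([*encoder.parameters(), *decoder.parameters()], lr=1e-3)

for epoch in range(50):                              # unsupervised pre-training (reconstruction)
    recon = decoder(encoder(X_unlabeled))
    loss = nn.functional.mse_loss(recon, X_unlabeled)
    opt.zero_grad()
    loss.backward()
    opt.step()

X_labeled = torch.randn(500, 41)                     # labelled subset (stand-in)
y_labeled = torch.randint(0, 2, (500,))              # attack vs. normal
Z = encoder(X_labeled).detach().numpy()              # low-dimensional learned features
svm = SVC(kernel="rbf", gamma="scale").fit(Z, y_labeled.numpy())
```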

Journal ArticleDOI
TL;DR: An investigation of the suitability of deep learning approaches for anomaly-based intrusion detection, based on different deep neural network structures, found promising results for real-world application in anomaly detection systems.
Abstract: Due to the monumental growth of Internet applications in the last decade, the need for security of information networks has increased manifold. As a primary defense of network infrastructure, an intrusion detection system is expected to adapt to a dynamically changing threat landscape. Many supervised and unsupervised techniques have been devised by researchers from the disciplines of machine learning and data mining to achieve reliable detection of anomalies. Deep learning is an area of machine learning which applies neuron-like structures to learning tasks. Deep learning has profoundly changed the way we approach learning tasks by delivering monumental progress in different disciplines such as speech processing, computer vision, and natural language processing, to name a few. It is only natural that this new technology be investigated for information security applications. The aim of this paper is to investigate the suitability of deep learning approaches for anomaly-based intrusion detection systems. For this research, we developed anomaly detection models based on different deep neural network structures, including convolutional neural networks, autoencoders, and recurrent neural networks. These deep models were trained on the NSLKDD training data set and evaluated on both test data sets provided by NSLKDD, namely NSLKDDTest+ and NSLKDDTest21. All experiments in this paper were performed by the authors on a GPU-based test bed. Conventional machine learning-based intrusion detection models were implemented using well-known classification techniques, including extreme learning machine, nearest neighbor, decision tree, random forest, support vector machine, naive Bayes, and quadratic discriminant analysis. Both deep and conventional machine learning models were evaluated using well-known classification metrics, including receiver operating characteristics, area under curve, precision-recall curve, mean average precision, and accuracy of classification. Experimental results of the deep IDS models showed promising results for real-world application in anomaly detection systems.

Journal ArticleDOI
TL;DR: Two machine learning approaches for the automatic classification of breast cancer histology images into benign and malignant classes and into benign and malignant sub-classes are compared.
Abstract: In recent years, the classification of breast cancer has been a topic of interest in the field of healthcare informatics, because it is the second main cause of cancer-related deaths in women. Breast cancer can be identified using a biopsy where tissue is removed and studied under a microscope. The diagnosis is based on the qualification of the histopathologist, who will look for abnormal cells. However, if the histopathologist is not well-trained, this may lead to a wrong diagnosis. With the recent advances in image processing and machine learning, there is an interest in developing reliable pattern recognition based systems to improve the quality of diagnosis. In this paper, we compare two machine learning approaches for the automatic classification of breast cancer histology images into benign and malignant and into benign and malignant sub-classes. The first approach is based on the extraction of a set of handcrafted features encoded by two coding models (bag of words and locality constrained linear coding) and trained by support vector machines, while the second approach is based on the design of convolutional neural networks. We have also experimentally tested dataset augmentation techniques to enhance the accuracy of the convolutional neural network, as well as "handcrafted features + convolutional neural network" and "convolutional neural network features + classifier" configurations. The results show that convolutional neural networks outperformed the handcrafted feature based classifier, where we achieved accuracy between 96.15% and 98.33% for the binary classification and 83.31% and 88.23% for the multi-class classification.

Journal ArticleDOI
TL;DR: An effective feature representation learning model is developed that can extract and learn a set of informative features from a pool of support vector machine-based models trained using sequence-based feature descriptors and provide the most discriminative power for identifying ACPs.
Abstract: Motivation: Anti-cancer peptides (ACPs) have recently emerged as promising therapeutic agents for cancer treatment. Due to the avalanche of protein sequence data in the post-genomic era, there is an urgent need to develop automated computational methods to enable fast and accurate identification of novel ACPs within the vast number of candidate proteins and peptides. Results: To address this, we propose a novel predictor named Anti-Cancer peptide Predictor with Feature representation Learning (ACPred-FL) for accurate prediction of ACPs based on sequence information. More specifically, we develop an effective feature representation learning model, with which we can extract and learn a set of informative features from a pool of support vector machine-based models trained using sequence-based feature descriptors. By doing so, the class label information of data samples is fully utilized. To improve the feature representation, we further employ a two-step feature selection technique, resulting in a most informative five-dimensional feature vector for the final peptide representation. Experimental results show that these five features provide greater discriminative power for identifying ACPs than currently available feature descriptors, highlighting the effectiveness of the proposed feature representation learning approach. The developed ACPred-FL method significantly outperforms state-of-the-art methods. Availability and implementation: The web-server of ACPred-FL is available at http://server.malab.cn/ACPred-FL. Supplementary information: Supplementary data are available at Bioinformatics online.
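
The feature-representation-learning step can be pictured as follows: several SVMs, each trained on a different sequence descriptor, emit class-probability scores that are concatenated into a compact learned feature vector for the final predictor. The sketch below uses random stand-ins for the descriptors and, for brevity, skips the cross-validation normally used to generate such scores without overfitting.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=300)                       # ACP vs. non-ACP labels (stand-in)
descriptors = [rng.normal(size=(300, d)) for d in (20, 50, 400)]   # e.g. composition/profile features

# one SVM per descriptor; its probability output becomes one learned feature
base_models = [SVC(kernel="rbf", probability=True).fit(D, y) for D in descriptors]
learned_features = np.column_stack(
    [m.predict_proba(D)[:, 1] for m, D in zip(base_models, descriptors)])

# final predictor trained on the compact learned representation
final_clf = SVC(kernel="rbf").fit(learned_features, y)
```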

Proceedings Article
01 Jan 2018
TL;DR: In this paper, a convex optimization formulation is proposed to minimize the coding length of stochastic gradient vectors to reduce the communication overhead for exchanging information among different workers in large-scale machine learning applications.
Abstract: Modern large-scale machine learning applications require stochastic optimization algorithms to be implemented on distributed computational architectures. A key bottleneck is the communication overhead for exchanging information such as stochastic gradients among different workers. In this paper, to reduce the communication cost, we propose a convex optimization formulation to minimize the coding length of stochastic gradients. The key idea is to randomly drop out coordinates of the stochastic gradient vectors and amplify the remaining coordinates appropriately to ensure the sparsified gradient to be unbiased. To solve the optimal sparsification efficiently, several simple and fast algorithms are proposed for an approximate solution, with a theoretical guarantee for sparseness. Experiments on $\ell_2$ regularized logistic regression, support vector machines, and convolutional neural networks validate our sparsification approaches.
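
The unbiased-dropping idea is simple to state in code: keep coordinate i with probability p_i and amplify it by 1/p_i, so the sparsified gradient has the right expectation. In the sketch below p_i is simply proportional to |g_i|, whereas the paper obtains the probabilities from a convex program.

```python
import numpy as np

def sparsify(g, budget=0.25, rng=None):
    """Randomly drop coordinates of gradient g, keeping ~budget*len(g) of them on average,
    and rescale the survivors so the result is an unbiased estimate of g."""
    rng = rng or np.random.default_rng()
    p = np.minimum(1.0, budget * len(g) * np.abs(g) / np.abs(g).sum())   # keep-probabilities
    mask = rng.random(len(g)) < p
    out = np.zeros_like(g)
    out[mask] = g[mask] / p[mask]            # amplification makes E[out] = g
    return out

g = np.random.default_rng(1).normal(size=10)
avg = np.mean([sparsify(g, rng=np.random.default_rng(s)) for s in range(5000)], axis=0)
print(np.allclose(avg, g, atol=0.1))         # the average of many sparsified copies approximates g
```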

Journal ArticleDOI
TL;DR: A three-layer, deep convolutional autoencoder (CAE) is proposed, which utilizes unsupervised pretraining to initialize the weights in the subsequent convolutional layers, and is shown to be more effective than other deep learning architectures.
Abstract: Radar-based activity recognition is a problem that has been of great interest due to applications such as border control and security, pedestrian identification for automotive safety, and remote health monitoring. This paper seeks to show the efficacy of micro-Doppler analysis to distinguish even those gaits whose micro-Doppler signatures are not visually distinguishable. Moreover, a three-layer, deep convolutional autoencoder (CAE) is proposed, which utilizes unsupervised pretraining to initialize the weights in the subsequent convolutional layers. This architecture is shown to be more effective than other deep learning architectures, such as convolutional neural networks and autoencoders, as well as conventional classifiers employing predefined features, such as support vector machines (SVM), random forest, and extreme gradient boosting. Results show the performance of the proposed deep CAE yields a correct classification rate of 94.2% for micro-Doppler signatures of 12 different human activities measured indoors using a 4 GHz continuous wave radar—17.3% improvement over SVM.

Journal ArticleDOI
TL;DR: In this article, an in situ defect detection strategy for powder bed fusion (PBF) AM using supervised machine learning is described, where multiple images were collected at each build layer using a high resolution digital single-lens reflex (DSLR) camera.
Abstract: Process monitoring in additive manufacturing (AM) is a crucial component in the mission of broadening AM industrialization. However, conventional part evaluation and qualification techniques, such as computed tomography (CT), can only be utilized after the build is complete, and thus eliminate any potential to correct defects during the build process. In contrast to post-build CT, in situ defect detection based on in situ sensing, such as layerwise visual inspection, enables the potential for in-process re-melting and correction of detected defects and thus facilitates in-process part qualification. This paper describes the development and implementation of such an in situ defect detection strategy for powder bed fusion (PBF) AM using supervised machine learning. During the build process, multiple images were collected at each build layer using a high resolution digital single-lens reflex (DSLR) camera. For each neighborhood in the resulting layerwise image stack, multi-dimensional visual features were extracted and evaluated using binary classification techniques, i.e. a linear support vector machine (SVM). Through binary classification, neighborhoods are then categorized as either a flaw, i.e. an undesirable interruption in the typical structure of the material, or a nominal build condition. Ground truth labels, i.e. the true location of flaws and nominal build areas, which are needed to train the binary classifiers, were obtained from post-build high-resolution 3D CT scan data. In CT scans, discontinuities, e.g. incomplete fusion, porosity, cracks, or inclusions, were identified using automated analysis tools or manual inspection. The xyz locations of the CT data were transferred into the layerwise image domain using an affine transformation, which was estimated using reference points embedded in the part. After the classifier had been properly trained, in situ defect detection accuracies greater than 80% were demonstrated during cross-validation experiments.

Journal ArticleDOI
TL;DR: A simple yet effective method to extract hierarchical deep spatial feature for HSI classification by exploring the power of off-the-shelf CNN models, without any additional retraining or fine-tuning on the target data set is proposed.
Abstract: Hyperspectral image (HSI) classification is an active and important research task driven by many practical applications. To leverage deep learning models, especially convolutional neural networks (CNNs), for HSI classification, this paper proposes a simple yet effective method to extract hierarchical deep spatial features for HSI classification by exploring the power of off-the-shelf CNN models, without any additional retraining or fine-tuning on the target data set. To obtain better classification accuracy, we further propose a unified metric learning-based framework to alternately learn discriminative spectral–spatial features, which have better representation capability, and train support vector machine (SVM) classifiers. To this end, we design a new objective function that explicitly embeds a metric learning regularization term into SVM training. The metric learning regularization term is used to learn a powerful spectral–spatial feature representation by fusing the spectral feature and the deep spatial feature, which has small intra-class scatter but large between-class separation. By transforming HSI data into a new spectral–spatial feature space through CNN and metric learning, we can pull pixels from the same class closer, while pushing pixels of different classes farther away. In the experiments, we comprehensively evaluate the proposed method on three commonly used HSI benchmark data sets. State-of-the-art results are achieved when compared with existing HSI classification methods.

Journal ArticleDOI
TL;DR: A novel network architecture, a fully Conv–Deconv network, is proposed for unsupervised spectral–spatial feature learning of hyperspectral images; it can be trained in an end-to-end manner, and an in-depth investigation of the learned features is introduced.
Abstract: Supervised approaches classify input data using a set of representative samples for each class, known as training samples . The collection of such samples is expensive and time demanding. Hence, unsupervised feature learning, which has a quick access to arbitrary amounts of unlabeled data, is conceptually of high interest. In this paper, we propose a novel network architecture, fully Conv–Deconv network, for unsupervised spectral–spatial feature learning of hyperspectral images, which is able to be trained in an end-to-end manner. Specifically, our network is based on the so-called encoder–decoder paradigm, i.e., the input 3-D hyperspectral patch is first transformed into a typically lower dimensional space via a convolutional subnetwork (encoder), and then expanded to reproduce the initial data by a deconvolutional subnetwork (decoder). However, during the experiment, we found that such a network is not easy to be optimized. To address this problem, we refine the proposed network architecture by incorporating: 1) residual learning and 2) a new unpooling operation that can use memorized max-pooling indexes. Moreover, to understand the “black box,” we make an in-depth study of the learned feature maps in the experimental analysis. A very interesting discovery is that some specific “neurons” in the first residual block of the proposed network own good description power for semantic visual patterns in the object level, which provide an opportunity to achieve “free” object detection. This paper, for the first time in the remote sensing community, proposes an end-to-end fully Conv–Deconv network for unsupervised spectral–spatial feature learning. Moreover, this paper also introduces an in-depth investigation of learned features. Experimental results on two widely used hyperspectral data, Indian Pines and Pavia University, demonstrate competitive performance obtained by the proposed methodology compared with other studied approaches.
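
The encoder-decoder pairing with memorized max-pooling indexes maps directly onto PyTorch's MaxPool2d/MaxUnpool2d; the toy module below shows only that mechanism (one convolutional stage, no residual blocks), with illustrative sizes.

```python
import torch
import torch.nn as nn

class ConvDeconvAE(nn.Module):
    def __init__(self, bands):
        super().__init__()
        self.enc_conv = nn.Conv2d(bands, 32, 3, padding=1)
        self.pool = nn.MaxPool2d(2, return_indices=True)    # remember where the maxima were
        self.unpool = nn.MaxUnpool2d(2)                      # put values back at those positions
        self.dec_conv = nn.Conv2d(32, bands, 3, padding=1)

    def forward(self, x):
        z = torch.relu(self.enc_conv(x))
        z, idx = self.pool(z)                                # encoder: downsample, keep indexes
        z = self.unpool(z, idx)                              # decoder: index-aware upsampling
        return self.dec_conv(z)

x = torch.randn(4, 103, 8, 8)                                # 3-D hyperspectral patches (stand-in)
recon = ConvDeconvAE(bands=103)(x)
loss = nn.functional.mse_loss(recon, x)                      # unsupervised reconstruction objective
```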

Posted Content
TL;DR: Adversarial Robustness Toolbox is a Python library supporting developers and researchers in defending Machine Learning models against adversarial threats and helps make AI systems more secure and trustworthy.
Abstract: Adversarial Robustness Toolbox (ART) is a Python library supporting developers and researchers in defending Machine Learning models (Deep Neural Networks, Gradient Boosted Decision Trees, Support Vector Machines, Random Forests, Logistic Regression, Gaussian Processes, Decision Trees, Scikit-learn Pipelines, etc.) against adversarial threats, and it helps make AI systems more secure and trustworthy. Machine Learning models are vulnerable to adversarial examples, which are inputs (images, texts, tabular data, etc.) deliberately modified to produce a desired response by the Machine Learning model. ART provides the tools to build and deploy defences and test them with adversarial attacks. Defending Machine Learning models involves certifying and verifying model robustness and model hardening with approaches such as pre-processing inputs, augmenting training data with adversarial samples, and leveraging runtime detection methods to flag any inputs that might have been modified by an adversary. The attacks implemented in ART allow creating adversarial attacks against Machine Learning models, which is required to test defenses with state-of-the-art threat models. Supported Machine Learning Libraries include TensorFlow (v1 and v2), Keras, PyTorch, MXNet, Scikit-learn, XGBoost, LightGBM, CatBoost, and GPy. The source code of ART is released with MIT license at this https URL. The release includes code examples, notebooks with tutorials and documentation (this http URL).

Journal ArticleDOI
TL;DR: The experimental results demonstrate that the proposed multilayer stacked covariance pooling method can not only consistently outperform the corresponding single-layer model but also achieve better classification performance than other pretrained CNN-based scene classification methods.
Abstract: This paper proposes a new method, called multilayer stacked covariance pooling (MSCP), for remote sensing scene classification. The innovative contribution of the proposed method is that it is able to naturally combine multilayer feature maps, obtained by pretrained convolutional neural network (CNN) models. Specifically, the proposed MSCP-based classification framework consists of the following three steps. First, a pretrained CNN model is used to extract multilayer feature maps. Then, the feature maps are stacked together, and a covariance matrix is calculated for the stacked features. Each entry of the resulting covariance matrix stands for the covariance of two different feature maps, which provides a natural and innovative way to exploit the complementary information provided by feature maps coming from different layers. Finally, the extracted covariance matrices are used as features for classification by a support vector machine. The experimental results, conducted on three challenging data sets, demonstrate that the proposed MSCP method can not only consistently outperform the corresponding single-layer model but also achieve better classification performance than other pretrained CNN-based scene classification methods.
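
A numpy sketch of the MSCP descriptor itself: feature maps from different layers (assumed already resized to a common spatial grid) are stacked, their covariance matrix is computed, and its upper triangle becomes the SVM feature vector. Dimensions and data are illustrative.

```python
import numpy as np
from sklearn.svm import SVC

def mscp_descriptor(feature_maps):
    """feature_maps: list of arrays of shape (C_i, H, W) taken from different CNN layers."""
    stacked = np.concatenate(feature_maps, axis=0)           # (C_total, H, W)
    flat = stacked.reshape(stacked.shape[0], -1)             # one row per feature map
    cov = np.cov(flat)                                       # (C_total, C_total) covariance matrix
    iu = np.triu_indices(cov.shape[0])
    return cov[iu]                                           # vectorized upper triangle

rng = np.random.default_rng(0)
# two stand-in scenes, each with feature maps from two layers already resized to 8x8
descriptors = np.stack([mscp_descriptor([rng.normal(size=(16, 8, 8)),
                                         rng.normal(size=(32, 8, 8))]) for _ in range(2)])
labels = np.array([0, 1])
clf = SVC(kernel="linear").fit(descriptors, labels)          # covariance features fed to an SVM
```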

Journal ArticleDOI
TL;DR: Results show that the EWT outperforms empirical mode decomposition for decomposing the signal into multiple components, and the proposed EWTFSFD method can accurately and effectively achieve the fault diagnosis of motor bearing.
Abstract: Motor bearings are subjected to the joint effects of loads, transmissions, and shocks that cause bearing faults and machinery breakdown. Vibration signal analysis is the most popular technique used to monitor and diagnose motor bearing faults. However, the application of vibration signal analysis to motor bearings is still limited in engineering practice. In this paper, on the basis of comparing fault feature extraction using the empirical wavelet transform (EWT) and the Hilbert transform with the theoretical calculation, a new motor bearing fault diagnosis method integrating EWT, fuzzy entropy, and support vector machine (SVM), called EWTFSFD, is proposed. In the proposed method, a novel signal processing method called EWT is used to decompose the vibration signal into multiple components in order to extract a series of amplitude modulated–frequency modulated (AM-FM) components with a supporting Fourier spectrum under an orthogonal basis. Then, fuzzy entropy is utilized to measure the complexity of the vibration signal, reflect the complexity changes of intrinsic oscillation, and compute the fuzzy entropy values of the AM-FM components, which are used as the inputs of the SVM model to train and construct an SVM classifier for fault pattern recognition. Finally, the effectiveness of the proposed method is validated using a simulated signal and real motor bearing vibration signals. The experimental results show that the EWT outperforms empirical mode decomposition for decomposing the signal into multiple components, and the proposed EWTFSFD method can accurately and effectively achieve fault diagnosis of motor bearings.

Journal ArticleDOI
TL;DR: The comparative performance of these techniques is illustrated and it is shown that object localization strategies cope well with cluttered X-ray security imagery, where classification techniques fail, and that fine-tuned CNN features yield superior performance to conventional hand-crafted features on object classification tasks within this context.
Abstract: We consider the use of deep convolutional neural networks (CNNs) with transfer learning for the image classification and detection problems posed within the context of X-ray baggage security imagery. The use of the CNN approach requires large amounts of data to facilitate a complex end-to-end feature extraction and classification process. Within the context of X-ray security screening, limited availability of object of interest data examples can thus pose a problem. To overcome this issue, we employ a transfer learning paradigm such that a pre-trained CNN, primarily trained for generalized image classification tasks where sufficient training data exists, can be optimized explicitly as a later secondary process towards this application domain. To provide a consistent feature-space comparison between this approach and traditional feature space representations, we also train a support vector machine (SVM) classifier on CNN features. We empirically show that fine-tuned CNN features yield superior performance to conventional hand-crafted features on object classification tasks within this context. Overall, we achieve 0.994 accuracy based on AlexNet features trained with an SVM classifier. In addition to classification, we also explore the applicability of multiple CNN driven detection paradigms, such as sliding window-based CNN (SW-CNN), Faster region-based CNNs (F-RCNNs), region-based fully convolutional networks (R-FCN), and YOLOv2. We train numerous networks tackling both single and multiple detections over SW-CNN/F-RCNN/R-FCN/YOLOv2 variants. YOLOv2, Faster-RCNN, and R-FCN provide superior results to the more traditional SW-CNN approaches. With the use of YOLOv2, using input images of size 544 × 544, we achieve 0.885 mean average precision (mAP) for a six-class object detection problem. The same approach with an input of size 416 × 416 yields 0.974 mAP for the two-class firearm detection problem and requires approximately 100 ms per image. Overall, we illustrate the comparative performance of these techniques and show that object localization strategies cope well with cluttered X-ray security imagery, where classification techniques fail.