
Showing papers in "Neurocomputing in 2015"


Journal ArticleDOI
TL;DR: A supervised method based on feature and ensemble learning is presented to tackle the problem of retinal blood vessel segmentation, which combines two superior classifiers: Convolutional Neural Network and Random Forest.
Abstract: Segmentation of retinal blood vessels is of substantial clinical importance for diagnoses of many diseases, such as diabetic retinopathy, hypertension and cardiovascular diseases. In this paper, a supervised method is presented to tackle the problem of retinal blood vessel segmentation, which combines two superior classifiers: Convolutional Neural Network (CNN) and Random Forest (RF). In this method, the CNN performs as a trainable hierarchical feature extractor and ensemble RFs work as a trainable classifier. By integrating the merits of feature learning and a traditional classifier, the proposed method is able to automatically learn features from the raw images and predict the patterns. Extensive experiments have been conducted on two public retinal image databases (DRIVE and STARE), and comparisons with other major studies on the same databases demonstrate the promising performance and effectiveness of the proposed method.
Highlights: A supervised method based on feature and ensemble learning is proposed. The whole pipeline of the proposed method is automatic and trainable. The Convolutional Neural Network performs as a trainable hierarchical feature extractor. Ensemble Random Forests work as a trainable classifier. Compared with the state of the art, the experimental results are promising.

344 citations
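
A minimal sketch of the feature-learning-plus-Random-Forest pipeline described in this entry, assuming scikit-learn and SciPy are available; the fixed random filters and the synthetic patches are placeholders for the trained CNN and for real retinal image patches, not the paper's implementation.

    # Sketch: convolution-style feature extraction followed by a Random Forest,
    # mirroring the CNN + RF pipeline above (random, untrained filters stand in
    # for the trained CNN; patches and labels are synthetic placeholders).
    import numpy as np
    from scipy.signal import convolve2d
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(0)
    n_patches, patch_size, n_filters = 400, 16, 8
    patches = rng.random((n_patches, patch_size, patch_size))   # stand-in image patches
    labels = rng.integers(0, 2, n_patches)                      # 1 = vessel pixel, 0 = background
    filters = rng.standard_normal((n_filters, 5, 5))            # stand-in for learned CNN kernels

    def extract_features(patch):
        feats = []
        for filt in filters:
            fmap = np.maximum(convolve2d(patch, filt, mode="valid"), 0.0)  # conv + ReLU
            h, w = fmap.shape
            fmap = fmap[: h - h % 2, : w - w % 2]                          # crop for 2x2 pooling
            pooled = fmap.reshape(fmap.shape[0] // 2, 2, fmap.shape[1] // 2, 2).max(axis=(1, 3))
            feats.append(pooled.ravel())
        return np.concatenate(feats)

    X = np.array([extract_features(p) for p in patches])
    rf = RandomForestClassifier(n_estimators=100, random_state=0)  # ensemble RF classifier
    print("CV accuracy:", cross_val_score(rf, X, labels, cv=5).mean())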


Journal ArticleDOI
TL;DR: A novel feature selection algorithm based on Ant Colony Optimization (ACO), called Advanced Binary ACO (ABACO), is presented, and simulation results verify that the algorithm provides a suitable feature subset with good classification accuracy, using a smaller feature set than competing feature selection methods.
Abstract: Feature selection is an important task for data analysis and information retrieval processing, pattern classification systems, and data mining applications. It reduces the number of features by removing noisy, irrelevant and redundant data. In this paper, a novel feature selection algorithm based on Ant Colony Optimization (ACO), called Advanced Binary ACO (ABACO), is presented. Features are treated as graph nodes to construct a graph model and are fully connected to each other. In this graph, each node has two sub-nodes, one for selecting and the other for deselecting the feature. The ant colony algorithm is used to select nodes, with each ant required to visit all features. The use of several statistical measures is examined as the heuristic function for visibility of the edges in the graph. At the end of a tour, each ant has a binary vector with the same length as the number of features, where 1 implies selecting and 0 implies deselecting the corresponding feature. The performance of the proposed algorithm is compared to that of the Binary Genetic Algorithm (BGA), Binary Particle Swarm Optimization (BPSO), CatfishBPSO, the Improved Binary Gravitational Search Algorithm (IBGSA), and some prominent ACO-based algorithms on the task of feature selection on 12 well-known UCI datasets. Simulation results verify that the algorithm provides a suitable feature subset with good classification accuracy using a smaller feature set than competing feature selection methods.

266 citations
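
A simplified sketch of binary ant-colony feature selection in the spirit of ABACO, assuming scikit-learn; the paper's heuristic visibility terms are omitted and the pheromone update is reduced to a bare-bones scheme, with fitness measured by 1-NN cross-validation accuracy on synthetic data.

    # Sketch: simplified binary ACO feature selection. Each feature has two pheromone
    # values ("deselect" / "select"); each ant visits every feature and picks one of
    # the two sub-nodes, yielding a binary mask scored by 1-NN cross-validation.
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(0)
    X, y = make_classification(n_samples=200, n_features=20, n_informative=5,
                               n_redundant=5, random_state=0)
    n_feat, n_ants, n_iter, rho = X.shape[1], 15, 20, 0.2
    pher = np.ones((n_feat, 2))            # pheromone for [deselect, select] per feature

    def fitness(mask):
        if mask.sum() == 0:
            return 0.0
        knn = KNeighborsClassifier(n_neighbors=1)
        return cross_val_score(knn, X[:, mask.astype(bool)], y, cv=3).mean()

    best_mask = np.ones(n_feat, dtype=int)
    best_fit = fitness(best_mask)
    for _ in range(n_iter):
        tours = []
        for _ant in range(n_ants):
            prob_select = pher[:, 1] / pher.sum(axis=1)
            mask = (rng.random(n_feat) < prob_select).astype(int)
            tours.append((mask, fitness(mask)))
        pher *= (1.0 - rho)                                   # evaporation
        it_mask, it_fit = max(tours, key=lambda t: t[1])
        pher[np.arange(n_feat), it_mask] += it_fit            # reinforce iteration-best tour
        if it_fit > best_fit:
            best_mask, best_fit = it_mask, it_fit

    print("selected features:", np.flatnonzero(best_mask), "accuracy:", round(best_fit, 3))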


Journal ArticleDOI
TL;DR: The experimental results show that the Wavelet-SVM approach not only has the best forecasting performance compared with the state-of-the-art techniques but also appears to be the most promising and robust, based on the historical passenger flow data in the Beijing subway system and several standard evaluation measures.
Abstract: In order to effectively manage the use of existing infrastructures and prevent emergencies caused by large gathered crowds, short-term passenger flow forecasting has become more and more significant in the field of intelligent transportation systems. However, there are few studies discussing how to predict different kinds of passenger flows in the subway system. In this paper, a novel hybrid model, Wavelet-SVM, is proposed; it combines the complementary advantages of the Wavelet and SVM models while overcoming their respective shortcomings. The Wavelet-SVM forecasting approach consists of three important stages. The first stage decomposes the passenger flow data into different high-frequency and low-frequency series by wavelet decomposition. During the prediction stage, the SVM method is applied to learn and predict the corresponding high-frequency and low-frequency series. In the last stage, the diverse predicted sequences are reconstructed by the wavelet. The experimental results show that the approach not only has the best forecasting performance compared with the state-of-the-art techniques but also appears to be the most promising and robust, based on the historical passenger flow data in the Beijing subway system and several standard evaluation measures.

253 citations
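
A hedged sketch of the decompose-predict-reconstruct idea, assuming PyWavelets and scikit-learn; the flow series is synthetic, and one-step SVR forecasts of additive wavelet components are simply summed, which is a simplification of the paper's pipeline.

    # Sketch: wavelet decomposition + SVR forecasting of a flow series, in the
    # spirit of the hybrid Wavelet-SVM approach above (synthetic data, simplified).
    import numpy as np
    import pywt
    from sklearn.svm import SVR

    rng = np.random.default_rng(0)
    t = np.arange(512)
    flow = 100 + 30 * np.sin(2 * np.pi * t / 64) + 5 * rng.standard_normal(t.size)

    # Additive multiresolution components: reconstruct with all but one band zeroed.
    wavelet, level, lags = "db4", 3, 8
    coeffs = pywt.wavedec(flow, wavelet, level=level)
    components = []
    for i in range(len(coeffs)):
        kept = [c if j == i else np.zeros_like(c) for j, c in enumerate(coeffs)]
        components.append(pywt.waverec(kept, wavelet)[: flow.size])

    def one_step_forecast(series):
        """Fit an SVR on lagged values and predict the next point of the series."""
        Xl = np.array([series[i - lags:i] for i in range(lags, series.size)])
        yl = series[lags:]
        model = SVR(kernel="rbf", C=10.0).fit(Xl, yl)
        return model.predict(series[-lags:].reshape(1, -1))[0]

    forecast = sum(one_step_forecast(c) for c in components)  # recombine component forecasts
    print("one-step-ahead forecast:", round(forecast, 2))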


Journal ArticleDOI
TL;DR: A novel FOPID controller design method based on an improved multi-objective extremal optimization (MOEO) algorithm for an automatic voltage regulator (AVR) system; the proposed MOEO algorithm is relatively simpler than NSGA-II and single-objective evolutionary algorithms, such as the genetic algorithm, particle swarm optimization (PSO) and chaotic ant swarm (CAS), due to its fewer adjustable parameters.
Abstract: The design of an effective and efficient fractional order PID (FOPID) controller, as a generalization of a standard PID controller based on fractional order calculus, for an industrial control system to obtain high-quality performance is of great theoretical and practical significance. From the perspective of multi-objective optimization, this paper presents a novel FOPID controller design method based on an improved multi-objective extremal optimization (MOEO) algorithm for an automatic voltage regulator (AVR) system. The problem of designing a FOPID controller for the AVR is first formulated as a multi-objective optimization problem with three objective functions: minimization of the integral of absolute error (IAE), the absolute steady-state error, and the settling time. Then, an improved MOEO algorithm is proposed to solve this problem by adopting an individual-based iterated optimization mechanism and polynomial mutation (PLM). From the perspective of algorithm design, the proposed MOEO algorithm is relatively simpler than NSGA-II and single-objective evolutionary algorithms, such as the genetic algorithm (GA), particle swarm optimization (PSO) and chaotic ant swarm (CAS), due to its fewer adjustable parameters. Furthermore, the superiority of the proposed MOEO-FOPID controller over NSGA-II-based FOPID controllers, single-objective evolutionary-algorithm-based FOPID controllers, and MOEO-based and NSGA-II-based PID controllers is demonstrated by extensive experimental results on an AVR system in terms of accuracy and robustness.

246 citations


Journal ArticleDOI
TL;DR: This paper addresses a multimodal deep support vector classification (MDSVC) approach, which employs separation-fusion based deep learning in order to perform fault diagnosis tasks for gearboxes, and shows that the proposed model achieves the best fault classification rate in experiments when compared to representative deep and shallow learning methods.
Abstract: Gearboxes are crucial transmission components in mechanical systems. Fault diagnosis is an important tool to maintain gearboxes in healthy conditions. It is challenging to recognize the existence of faults and, if any, the failure patterns in such transmission elements due to their complicated configurations. This paper addresses a multimodal deep support vector classification (MDSVC) approach, which employs separation-fusion based deep learning in order to perform fault diagnosis tasks for gearboxes. Considering that different modalities can be used to describe the same object, multimodal homologous features of the gearbox vibration measurements are first separated into time, frequency and wavelet modalities, respectively. A Gaussian-Bernoulli deep Boltzmann machine (GDBM) without a final output layer is subsequently suggested to learn pattern representations for the features in each modality. A support vector classifier is finally applied to fuse the GDBMs in different modalities towards the construction of the MDSVC model. With the present model, "deep" representations from "wide" modalities improve fault diagnosis capabilities. Fault diagnosis experiments were carried out to evaluate the proposed method on both spur and helical gearboxes. The proposed model achieves the best fault classification rate in experiments when compared to representative deep and shallow learning methods. Results indicate that the proposed separation-fusion based deep learning strategy is effective for gearbox fault diagnosis.

235 citations


Journal ArticleDOI
TL;DR: This work considers an attacker that aims to maximize the SVM's classification error by flipping a number of labels in the training data, and formalizes a corresponding optimal attack strategy, and solves it by means of heuristic approaches to keep the computational complexity tractable.
Abstract: Machine learning algorithms are increasingly being applied in security-related tasks such as spam and malware detection, although their security properties against deliberate attacks have not yet been widely understood. Intelligent and adaptive attackers may indeed exploit specific vulnerabilities exposed by machine learning techniques to violate system security. Being robust to adversarial data manipulation is thus an important, additional requirement for machine learning algorithms to successfully operate in adversarial settings. In this work, we evaluate the security of Support Vector Machines (SVMs) to well-crafted, adversarial label noise attacks. In particular, we consider an attacker that aims to maximize the SVM's classification error by flipping a number of labels in the training data. We formalize a corresponding optimal attack strategy, and solve it by means of heuristic approaches to keep the computational complexity tractable. We report an extensive experimental analysis on the effectiveness of the considered attacks against linear and non-linear SVMs, both on synthetic and real-world datasets. We finally argue that our approach can also provide useful insights for developing more secure SVM learning algorithms, and also novel techniques in a number of related research areas, such as semi-supervised and active learning.

226 citations
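
A crude greedy label-flipping attack against a linear SVM, sketched with scikit-learn on synthetic data; this stands in for the heuristic attack strategies the paper formalizes and is not their exact algorithm.

    # Sketch: greedy label-flip attack on a linear SVM. At each step, flip the
    # training label whose flip most increases validation error after retraining.
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.svm import SVC

    X, y = make_classification(n_samples=300, n_features=10, random_state=0)
    X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.5, random_state=0)

    def val_error(labels):
        clf = SVC(kernel="linear", C=1.0).fit(X_tr, labels)
        return 1.0 - clf.score(X_val, y_val)

    y_adv = y_tr.copy()
    budget = 10                                 # number of labels the attacker may flip
    for _ in range(budget):
        base = val_error(y_adv)
        gains = []
        for i in range(len(y_adv)):
            trial = y_adv.copy()
            trial[i] = 1 - trial[i]
            gains.append(val_error(trial) - base)
        best = int(np.argmax(gains))
        y_adv[best] = 1 - y_adv[best]           # commit the most damaging flip

    print("clean error:", round(val_error(y_tr), 3),
          "after attack:", round(val_error(y_adv), 3))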


Journal ArticleDOI
TL;DR: A deep architecture, AU-inspired Deep Networks (AUDN), is proposed, inspired by the psychological theory that expressions can be decomposed into multiple facial Action Units (AUs); it achieves state-of-the-art results on all the databases, validating the effectiveness of AUDN in both lab-controlled and wild environments.
Abstract: Most existing technologies for facial expression recognition utilize off-the-shelf feature extraction methods for classification. In this paper, aiming at learning better features specific for expression representation, we propose to construct a deep architecture, AU-inspired Deep Networks (AUDN), inspired by the psychological theory that expressions can be decomposed into multiple facial Action Units (AUs). To fully exploit this inspiration but avoid detecting AUs, we propose to automatically learn: (1) informative local appearance variations; (2) an optimal way to combine local variations; and (3) a high-level representation for final expression recognition. Accordingly, the proposed AUDN is composed of three sequential modules. Firstly, we build a convolutional layer and a max-pooling layer to learn the Micro-Action-Pattern (MAP) representation, which can explicitly depict local appearance variations caused by facial expressions. Secondly, feature grouping is applied to simulate larger receptive fields by combining correlated MAPs adaptively, aiming to generate more abstract mid-level semantics. Finally, a multi-layer learning process is employed in each receptive field respectively to construct group-wise sub-networks for higher-level representations. Experiments on three expression databases, CK+, MMI and SFEW, demonstrate that, by simply applying linear classifiers to the learned features, our method can achieve state-of-the-art results on all the databases, which validates the effectiveness of AUDN in both lab-controlled and wild environments.

217 citations


Journal ArticleDOI
TL;DR: Through extensive empirical studies, it is shown that risk minimization under the 0-1 loss, the sigmoid loss and the ramp loss has much better robustness to label noise when compared to the SVM algorithm.
Abstract: In many applications, the training data, from which one needs to learn a classifier, is corrupted with label noise. Many standard algorithms such as SVM perform poorly in the presence of label noise. In this paper we investigate the robustness of risk minimization to label noise. We prove a sufficient condition on a loss function for the risk minimization under that loss to be tolerant to uniform label noise. We show that the 0-1 loss, sigmoid loss, ramp loss and probit loss satisfy this condition though none of the standard convex loss functions satisfy it. We also prove that, by choosing a sufficiently large value of a parameter in the loss function, the sigmoid loss, ramp loss and probit loss can be made tolerant to non-uniform label noise also if we can assume the classes to be separable under noise-free data distribution. Through extensive empirical studies, we show that risk minimization under the 0-1 loss, the sigmoid loss and the ramp loss has much better robustness to label noise when compared to the SVM algorithm.

213 citations
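
A small illustration of risk minimization under the sigmoid loss versus a hinge-loss SVM when training labels are flipped, using scikit-learn and synthetic data; the loss parameter, optimizer and noise level are arbitrary choices, not the paper's protocol.

    # Sketch: empirical risk minimization with the sigmoid loss for a linear
    # classifier, compared with a hinge-loss SVM under symmetric label noise.
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.svm import LinearSVC

    rng = np.random.default_rng(0)
    X, y01 = make_classification(n_samples=600, n_features=10, random_state=0)
    y = 2 * y01 - 1                                        # labels in {-1, +1}
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)

    noise = rng.random(y_tr.size) < 0.3                    # flip 30% of training labels
    y_noisy = np.where(noise, -y_tr, y_tr)

    def fit_sigmoid_loss(X, y, beta=2.0, lr=0.5, epochs=500):
        """Minimize mean 1 / (1 + exp(beta * y * w.x)) by full-batch gradient descent."""
        w = np.zeros(X.shape[1])
        for _ in range(epochs):
            z = np.clip(beta * y * (X @ w), -50.0, 50.0)
            s = 1.0 / (1.0 + np.exp(z))                    # per-sample sigmoid loss
            grad = X.T @ (-beta * y * s * (1.0 - s)) / X.shape[0]
            w -= lr * grad
        return w

    w = fit_sigmoid_loss(X_tr, y_noisy)
    acc_sigmoid = np.mean(np.sign(X_te @ w) == y_te)
    acc_svm = LinearSVC(C=1.0, dual=False, max_iter=5000).fit(X_tr, y_noisy).score(X_te, y_te)
    print("sigmoid-loss accuracy:", round(acc_sigmoid, 3), "SVM accuracy:", round(acc_svm, 3))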


Journal ArticleDOI
TL;DR: A new CAD system is presented that allows early AD diagnosis using tissue-segmented brain images; it is based on several multivariate approaches, such as partial least squares (PLS) and principal component analysis (PCA), and aims to discriminate between AD, mild cognitive impairment (MCI) and elderly normal control (NC) subjects.
Abstract: Computer aided diagnosis (CAD) systems using functional and structural imaging techniques enable physicians to detect early stages of Alzheimer's disease (AD). For this purpose, magnetic resonance imaging (MRI) has proved to be very useful in the assessment of pathological tissues in AD. This paper presents a new CAD system that allows early AD diagnosis using tissue-segmented brain images. The proposed methodology aims to discriminate between AD, mild cognitive impairment (MCI) and elderly normal control (NC) subjects and is based on several multivariate approaches, such as partial least squares (PLS) and principal component analysis (PCA). In this study, 188 AD patients, 401 MCI patients and 229 control subjects from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database were studied. Automated brain tissue segmentation was performed for each image, obtaining gray matter (GM) and white matter (WM) tissue distributions. The validity of the analyzed methods was tested on the ADNI database by implementing support vector machine classifiers with linear or radial basis function (RBF) kernels to distinguish between normal subjects and AD patients. The performance of our methodology was validated using k-fold cross-validation, where the system based on PLS feature extraction and a linear SVM classifier outperformed the PCA method. In addition, PLS feature extraction is found to be more effective for extracting discriminative information from the data. In this regard, the PLS-based CAD system yielded maximum sensitivity, specificity and accuracy values of 85.11%, 91.27% and 88.49%, respectively.

213 citations
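
A minimal sketch of the PLS/PCA feature extraction followed by a linear SVM with k-fold cross-validation, assuming scikit-learn; random synthetic features stand in for the segmented GM/WM tissue maps.

    # Sketch: PLS and PCA feature extraction followed by a linear SVM, evaluated
    # with stratified k-fold cross-validation (synthetic stand-in features).
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.decomposition import PCA
    from sklearn.cross_decomposition import PLSRegression
    from sklearn.model_selection import StratifiedKFold
    from sklearn.svm import SVC

    X, y = make_classification(n_samples=200, n_features=500, n_informative=20,
                               random_state=0)              # stand-in voxel features
    n_components, scores = 10, {"PLS": [], "PCA": []}
    for train, test in StratifiedKFold(n_splits=5, shuffle=True, random_state=0).split(X, y):
        # PLS: supervised projection fitted on the training fold only
        pls = PLSRegression(n_components=n_components).fit(X[train], y[train].astype(float))
        clf = SVC(kernel="linear").fit(pls.transform(X[train]), y[train])
        scores["PLS"].append(clf.score(pls.transform(X[test]), y[test]))
        # PCA: unsupervised projection for comparison
        pca = PCA(n_components=n_components).fit(X[train])
        clf = SVC(kernel="linear").fit(pca.transform(X[train]), y[train])
        scores["PCA"].append(clf.score(pca.transform(X[test]), y[test]))

    print({k: round(float(np.mean(v)), 3) for k, v in scores.items()})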


Journal ArticleDOI
TL;DR: A novel distributed partitioning methodology for prototype reduction techniques in nearest neighbor classification that enables prototype reduction algorithms to be applied over big data classification problems without significant accuracy loss and is a suitable tool to enhance the performance of the nearest neighbor classifier with big data.
Abstract: In the era of big data, analyzing and extracting knowledge from large-scale data sets is a very interesting and challenging task. The application of standard data mining tools to such data sets is not straightforward. Hence, a new class of scalable mining methods that embraces the huge storage and processing capacity of cloud platforms is required. In this work, we propose a novel distributed partitioning methodology for prototype reduction techniques in nearest neighbor classification. These methods aim at representing original training data sets as a reduced number of instances. Their main purposes are to speed up the classification process and reduce the storage requirements and sensitivity to noise of the nearest neighbor rule. However, standard prototype reduction methods cannot cope with very large data sets. To overcome this limitation, we develop a MapReduce-based framework to distribute the functioning of these algorithms through a cluster of computing elements, proposing several algorithmic strategies to integrate multiple partial solutions (reduced sets of prototypes) into a single one. The proposed model enables prototype reduction algorithms to be applied over big data classification problems without significant accuracy loss. We test the speed-up capabilities of our model with data sets of up to 5.7 million instances. The results show that this model is a suitable tool to enhance the performance of the nearest neighbor classifier with big data.

212 citations
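
A sketch of the partition-reduce-join idea using scikit-learn: each partition is reduced independently (the map phase), the reduced sets are joined (the reduce phase), and 1-NN classifies with the joined prototypes. Per-class k-means centroids are used only as a stand-in for the prototype reduction techniques the paper distributes, and the map phase runs sequentially here rather than on a MapReduce cluster.

    # Sketch: map/reduce-style prototype reduction for nearest neighbour classification.
    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.neighbors import KNeighborsClassifier

    X, y = make_classification(n_samples=5000, n_features=10, random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

    def reduce_partition(Xp, yp, prototypes_per_class=20):
        """Map phase: replace a partition by per-class k-means centroids."""
        protos, labels = [], []
        for cls in np.unique(yp):
            Xc = Xp[yp == cls]
            k = min(prototypes_per_class, len(Xc))
            km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(Xc)
            protos.append(km.cluster_centers_)
            labels.append(np.full(k, cls))
        return np.vstack(protos), np.concatenate(labels)

    n_partitions = 8
    parts = zip(np.array_split(X_tr, n_partitions), np.array_split(y_tr, n_partitions))
    reduced = [reduce_partition(Xp, yp) for Xp, yp in parts]       # would run in parallel
    X_red = np.vstack([r[0] for r in reduced])                     # reduce phase: join sets
    y_red = np.concatenate([r[1] for r in reduced])

    knn = KNeighborsClassifier(n_neighbors=1).fit(X_red, y_red)
    print("prototypes:", len(X_red), "of", len(X_tr),
          "- accuracy:", round(knn.score(X_te, y_te), 3))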


Journal ArticleDOI
TL;DR: In this algorithm, a reinforced memory strategy is designed to update the local leaders of particles to avoid the degradation of outstanding genes in the particles, and a uniform combination is proposed to balance the local exploitation and the global exploration of the algorithm.
Abstract: Feature selection is a useful pre-processing technique for solving classification problems. As an almost parameter-free optimization algorithm, the bare bones particle swarm optimization (BPSO) has been applied to optimization over continuous or integer spaces, but it has not been applied to feature selection problems with binary variables. In this paper, we propose a new method, called the binary BPSO, to find an optimal feature subset with the BPSO. In this algorithm, a reinforced memory strategy is designed to update the local leaders of particles to avoid the degradation of outstanding genes in the particles, and a uniform combination is proposed to balance the local exploitation and the global exploration of the algorithm. Moreover, the 1-nearest neighbor method is used as a classifier to evaluate the classification accuracy of a particle. Several standard international data sets are selected to evaluate the proposed algorithm. The experiments show that the proposed algorithm is competitive in terms of both classification accuracy and computational performance.
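
A simplified sketch of binary bare bones PSO feature selection with a 1-NN wrapper, assuming scikit-learn; the paper's reinforced memory strategy and uniform combination operator are omitted, so this only illustrates the basic mechanism on synthetic data.

    # Sketch: simplified binary bare bones PSO. Positions are sampled from a Gaussian
    # centred between personal and global bests (no velocity term) and thresholded
    # into a binary feature mask scored by 1-NN cross-validation accuracy.
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(0)
    X, y = make_classification(n_samples=200, n_features=20, n_informative=5,
                               n_redundant=5, random_state=0)
    n_feat, n_particles, n_iter = X.shape[1], 15, 25

    def fitness(bits):
        if bits.sum() == 0:
            return 0.0
        knn = KNeighborsClassifier(n_neighbors=1)
        return cross_val_score(knn, X[:, bits.astype(bool)], y, cv=3).mean()

    pos = rng.random((n_particles, n_feat))                 # continuous positions in [0, 1]
    pbest = pos.copy()
    pbest_fit = np.array([fitness((p > 0.5).astype(int)) for p in pos])
    gbest = pbest[pbest_fit.argmax()].copy()

    for _ in range(n_iter):
        mu = (pbest + gbest) / 2.0                          # bare bones Gaussian sampling
        sigma = np.abs(pbest - gbest) + 1e-6
        pos = np.clip(rng.normal(mu, sigma), 0.0, 1.0)
        fits = np.array([fitness((p > 0.5).astype(int)) for p in pos])
        improved = fits > pbest_fit
        pbest[improved], pbest_fit[improved] = pos[improved], fits[improved]
        gbest = pbest[pbest_fit.argmax()].copy()

    best_bits = (gbest > 0.5).astype(int)
    print("selected features:", np.flatnonzero(best_bits), "accuracy:", round(pbest_fit.max(), 3))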

Journal ArticleDOI
TL;DR: The existence and uniqueness of the equilibrium point for fractional-order Hopfield neural networks with time delay are proved, and the global asymptotic stability conditions of fractional-order neural networks with time delay are obtained by using the Lyapunov method.
Abstract: In this paper, the global stability analysis of fractional-order Hopfield neural networks with time delay is investigated. A stability theorem for linear fractional-order systems with time delay is presented. In addition, a comparison theorem for a class of fractional-order systems with time delay is shown. The existence and uniqueness of the equilibrium point for fractional-order Hopfield neural networks with time delay are proved. Furthermore, the global asymptotic stability conditions of fractional-order neural networks with time delay are obtained. Finally, a numerical example is given to illustrate the effectiveness of the theoretical results.
Highlights: The stability criterion of linear fractional-order systems with time delay is deduced. The existence and uniqueness of the equilibrium point for fractional-order time delay neural networks are analyzed. Global stability conditions of fractional-order time delay neural networks are obtained by using the Lyapunov method.
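
For reference, the class of systems typically studied in this setting is the delayed fractional-order Hopfield network below, written with the Caputo derivative of order 0 < \alpha < 1; the paper's precise assumptions on the activation functions and the delay may differ.

    D^{\alpha} x_i(t) = -c_i x_i(t) + \sum_{j=1}^{n} a_{ij} f_j\bigl(x_j(t)\bigr)
        + \sum_{j=1}^{n} b_{ij} g_j\bigl(x_j(t-\tau)\bigr) + I_i, \qquad i = 1, \dots, n,

where c_i > 0 are self-regulating rates, a_{ij} and b_{ij} are connection weights, f_j and g_j are activation functions, \tau \ge 0 is the transmission delay, and I_i is an external input.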

Journal ArticleDOI
TL;DR: The purpose of this paper is to present specialized measures directed to assess the imbalance level in multilabel datasets (MLDs) and propose several algorithms designed to reduce the imbalance in MLDs in a classifier-independent way, by means of resampling techniques.
Abstract: The purpose of this paper is to analyze the imbalanced learning task in the multilabel scenario, aiming to accomplish two different goals. The first one is to present specialized measures directed at assessing the imbalance level in multilabel datasets (MLDs). Using these measures, we will be able to conclude which MLDs are imbalanced and therefore would need an appropriate treatment. The second objective is to propose several algorithms designed to reduce the imbalance in MLDs in a classifier-independent way, by means of resampling techniques. Two different approaches to dividing the instances into minority and majority groups are studied. One of them considers each label combination as a class identifier, whereas the other performs an individual evaluation of each label's imbalance level. A random undersampling and a random oversampling algorithm are proposed for each approach, resulting in four different algorithms. All of them are experimentally tested and their effectiveness is statistically evaluated. From the results obtained, a set of guidelines indicating when these methods should be applied is also provided.
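
A small sketch of per-label imbalance measures of the IRLbl/MeanIR kind and a random oversampling step that clones instances carrying under-represented labels; the label matrix is synthetic and the paper's exact measures and resampling algorithms may differ in detail.

    # Sketch: per-label imbalance ratios and a simple random oversampling that clones
    # instances carrying any label whose imbalance ratio exceeds the mean ratio.
    import numpy as np

    rng = np.random.default_rng(0)
    # Y: binary label matrix, one row per instance, one column per label (synthetic).
    Y = (rng.random((1000, 6)) < np.array([0.45, 0.30, 0.20, 0.10, 0.05, 0.02])).astype(int)
    X = rng.standard_normal((1000, 8))

    counts = Y.sum(axis=0)
    irlbl = counts.max() / np.maximum(counts, 1)      # imbalance ratio per label
    mean_ir = irlbl.mean()
    print("IRLbl per label:", np.round(irlbl, 2), "MeanIR:", round(float(mean_ir), 2))

    minority_labels = np.flatnonzero(irlbl > mean_ir)
    candidates = np.flatnonzero(Y[:, minority_labels].any(axis=1))
    n_clones = int(0.25 * len(X))                     # oversampling budget (25% of the data)
    clones = rng.choice(candidates, size=n_clones, replace=True)
    X_res, Y_res = np.vstack([X, X[clones]]), np.vstack([Y, Y[clones]])

    print("resampled IRLbl:", np.round(Y_res.sum(0).max() / np.maximum(Y_res.sum(0), 1), 2))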

Journal ArticleDOI
TL;DR: The recent progress in visual feature detection is presented and future trends as well as challenges are identified and the relations among different kinds of features are covered.
Abstract: Feature detection is a fundamental and important problem in computer vision and image processing. It is a low-level processing step which serves as an essential part of computer vision based applications. The goal of this paper is to present a survey of recent progress and advances in visual feature detection. Firstly, we describe the relations among edges, corners and blobs from the psychological view. Secondly, we classify the algorithms for detecting edges, corners and blobs into different categories and provide detailed descriptions of representative recent algorithms in each category. Considering that machine learning has become more involved in visual feature detection, we put more emphasis on machine learning based feature detection methods. Thirdly, evaluation standards and databases are also introduced. Through this survey we would like to present the recent progress in visual feature detection and identify future trends as well as challenges.
Highlights: We survey the recent progress and advances in visual feature detection. The relations among different kinds of features are covered. Representative feature detection algorithms are described. We categorize and discuss the pros/cons of different kinds of visual features. We put some emphasis on future challenges in feature design through this survey.

Journal ArticleDOI
TL;DR: A hybrid modeling approach which combines Artificial Neural Networks and a simple statistical approach in order to provide a one hour forecast of urban traffic flow rates is shown.
Abstract: In this paper we show a hybrid modeling approach which combines Artificial Neural Networks and a simple statistical approach in order to provide a one hour forecast of urban traffic flow rates. Experimentation has been carried out on three different classes of real streets and results show that the proposed approach outperforms the best of the methods it puts together.

Journal ArticleDOI
TL;DR: This work proposes an efficient extension of t-SNE to a parametric framework, kernel t-SNE, which preserves the flexibility of basic t-SNE but enables explicit out-of-sample extensions, and demonstrates that this technique yields satisfactory results also for large data sets.
Abstract: Novel non-parametric dimensionality reduction techniques such as t-distributed stochastic neighbor embedding (t-SNE) lead to a powerful and flexible visualization of high-dimensional data. One drawback of non-parametric techniques is their lack of an explicit out-of-sample extension. In this contribution, we propose an efficient extension of t-SNE to a parametric framework, kernel t-SNE, which preserves the flexibility of basic t-SNE, but enables explicit out-of-sample extensions. We test the ability of kernel t-SNE in comparison to standard t-SNE for benchmark data sets, in particular addressing the generalization ability of the mapping for novel data. In the context of large data sets, this procedure enables us to train a mapping for a fixed size subset only, mapping all data afterwards in linear time. We demonstrate that this technique yields satisfactory results also for large data sets provided missing information due to the small size of the subset is accounted for by auxiliary information such as class labels, which can be integrated into kernel t-SNE based on the Fisher information.
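
A hedged sketch of a kernel-based out-of-sample extension of t-SNE, assuming scikit-learn: t-SNE is fitted on a training subset, a normalized Gaussian kernel map into the embedding is solved by ridge regression, and new points are mapped in linear time; the bandwidth and regularization choices are ad hoc, not the paper's.

    # Sketch: out-of-sample t-SNE via a normalized-kernel ridge mapping fitted on a subset.
    import numpy as np
    from sklearn.datasets import load_digits
    from sklearn.manifold import TSNE
    from sklearn.metrics.pairwise import rbf_kernel

    X, _ = load_digits(return_X_y=True)
    X_train, X_new = X[:500], X[500:600]              # subset used to train the mapping

    Y_train = TSNE(n_components=2, random_state=0).fit_transform(X_train)

    gamma = 1.0 / (2.0 * np.median(np.var(X_train, axis=0)) * X_train.shape[1])
    def normalized_kernel(A, B):
        K = rbf_kernel(A, B, gamma=gamma)
        return K / K.sum(axis=1, keepdims=True)       # rows sum to one

    K_tt = normalized_kernel(X_train, X_train)
    lam = 1e-3                                        # ridge regularization
    alpha = np.linalg.solve(K_tt.T @ K_tt + lam * np.eye(len(X_train)), K_tt.T @ Y_train)

    Y_new = normalized_kernel(X_new, X_train) @ alpha # out-of-sample embedding in linear time
    print("embedded new points:", Y_new.shape)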

Journal ArticleDOI
TL;DR: It is observed that the proposed mammogram classification scheme performs better with respect to accuracy and area under the curve (AUC) of the receiver operating characteristic (ROC).
Abstract: In this paper, we propose a mammogram classification scheme to classify breast tissues as normal, benign or malignant. A feature matrix is generated by applying the GLCM to all the detail coefficients from the 2D-DWT of the region of interest (ROI) of a mammogram. To derive the relevant features from the feature matrix, we apply the t-test and F-test separately. The relevant features are used in a BPNN classifier for classification. Two standard databases, MIAS and DDSM, are used for the validation of the proposed scheme. It is observed that the t-test based relevant features outperform those of the F-test with respect to accuracy. In addition to the suggested scheme, competing schemes are also simulated for comparative analysis. It is observed that the proposed scheme performs better with respect to accuracy and area under the curve (AUC) of the receiver operating characteristic (ROC). The accuracy measures are computed with respect to normal vs. abnormal and benign vs. malignant classification. For the MIAS database these accuracy measures are 98.0% and 94.2%, respectively, whereas for the DDSM database they are 98.8% and 97.4%.

Journal ArticleDOI
TL;DR: Comparison with typical forecasting methods such as feed forward neural network (FFNN) shows that the proposed method is applicable to the prediction of foreign exchange rate and works better than traditional methods.
Abstract: Forecasting exchange rates is an important financial problem. In this paper, an improved deep belief network (DBN) is proposed for forecasting exchange rates. By using continuous restricted Boltzmann machines (CRBMs) to construct a DBN, we update the classical DBN to model continuous data. The structure of the DBN is optimally determined through experiments for application in exchange rate forecasting. Also, the conjugate gradient method is applied to accelerate the learning of the DBN. In the experiments, three exchange rate series are tested and six evaluation criteria are adopted to evaluate the performance of the proposed method. Comparison with typical forecasting methods such as the feedforward neural network (FFNN) shows that the proposed method is applicable to the prediction of foreign exchange rates and works better than traditional methods.

Journal ArticleDOI
TL;DR: The findings reveal that the hybrid optimization strategy proposed here may be used as a promising alternative forecasting tool, offering higher forecasting accuracy and better generalization ability while avoiding premature convergence.
Abstract: In this paper, an effective hybrid optimization strategy that incorporates the adaptive optimization of particle swarm optimization (PSO) into a genetic algorithm (GA), namely HPSOGA, is used for automatically determining the parameters of radial basis function neural networks (the number of neurons, their respective centers and radii). Traditionally, this task depends on the operator's experience through trial and error due to a lack of prior knowledge, or on gradient-based algorithms that are highly dependent on initial values. In this paper, hybrid evolutionary algorithms are used to automatically build a radial basis function neural network (RBF-NN) that solves a specified problem, related in this case to rainfall forecasting. In HPSOGA, individuals in a new generation are created through three approaches to improve the global optimization performance: an elitist strategy, a PSO strategy and a GA strategy. The upper half of the best-performing individuals in a population are regarded as elites, whereas the worst-performing half are regarded as a swarm. The group constituted by the elites is enhanced, and selection, crossover and mutation operations are applied to these enhanced elites. HPSOGA is applied to RBF-NN design for rainfall prediction. The performance of HPSOGA is compared to that of a pure GA on these basis function neural network design problems, showing that the hybrid strategy has more effective global exploration ability and avoids premature convergence. Our findings reveal that the hybrid optimization strategy proposed here may be used as a promising alternative forecasting tool, offering higher forecasting accuracy and better generalization ability.

Journal ArticleDOI
TL;DR: A supervised machine learning based solution is proposed for effective spammer detection; the experiments show that the proposed solution is capable of providing excellent performance, with true positive rates for spammers and non-spammers reaching 99.1% and 99.9%, respectively.
Abstract: Social networks have become a very popular way for internet users to communicate and interact online. Users spend plenty of time on famous social networks (e.g., Facebook, Twitter, Sina Weibo, etc.) reading news, discussing events and posting messages. Unfortunately, this popularity also attracts a significant number of spammers who continuously exhibit malicious behavior (e.g., posting messages containing commercial URLs, following a large number of users, etc.), leading to great misunderstanding and inconvenience in users' social activities. In this paper, a supervised machine learning based solution is proposed for effective spammer detection. The main procedure of the work is as follows: first, collect a dataset from Sina Weibo including 30,116 users and more than 16 million messages. Then, construct a labeled dataset of users by manually classifying users into spammers and non-spammers. Afterwards, extract a set of features from message content and users' social behavior, and feed them into an SVM (Support Vector Machine) based spammer detection algorithm. The experiments show that the proposed solution is capable of providing excellent performance, with true positive rates for spammers and non-spammers reaching 99.1% and 99.9%, respectively.
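
A minimal sketch of an SVM-based spammer classifier, assuming scikit-learn; the feature names (url_ratio, follower_ratio, posts_per_day, mention_ratio) and the data are illustrative assumptions, not the feature set extracted from Sina Weibo in the paper.

    # Sketch: SVM spammer detection over simple account/content features
    # (feature names and data are synthetic illustrations, not the paper's).
    import numpy as np
    from sklearn.model_selection import train_test_split
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVC

    rng = np.random.default_rng(0)
    n = 2000
    y = rng.integers(0, 2, n)                                   # 1 = spammer, 0 = legitimate
    url_ratio      = np.clip(rng.normal(0.2 + 0.5 * y, 0.15), 0, 1)
    follower_ratio = np.clip(rng.normal(1.0 - 0.7 * y, 0.30), 0, None)
    posts_per_day  = rng.poisson(3 + 20 * y)
    mention_ratio  = np.clip(rng.normal(0.1 + 0.4 * y, 0.10), 0, 1)
    X = np.column_stack([url_ratio, follower_ratio, posts_per_day, mention_ratio])

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
    clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))
    clf.fit(X_tr, y_tr)

    pred = clf.predict(X_te)
    tpr_spam = np.mean(pred[y_te == 1] == 1)
    tpr_ham  = np.mean(pred[y_te == 0] == 0)
    print("spammer TPR:", round(tpr_spam, 3), "non-spammer TPR:", round(tpr_ham, 3))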

Journal ArticleDOI
TL;DR: This paper considers two factors of multi-label features, feature dependency and feature redundancy, and proposes an evaluation measure that combines mutual information with a max-dependency and min-redundancy algorithm, which allows a superior feature subset to be selected for multi-label learning.
Abstract: Multi-label learning deals with data belonging to different labels simultaneously. Like traditional supervised feature selection, multi-label feature selection also plays an important role in data mining, information retrieval, and machine learning. In this paper, we first consider two factors of multi-label features: feature dependency and feature redundancy. In particular, dependency implies the degree to which a candidate feature contributes to each label, and redundancy represents the information overlap between the candidate feature and the selected features under all labels. We then propose an evaluation measure that combines mutual information with a max-dependency and min-redundancy algorithm, which allows us to select a superior feature subset for multi-label learning. Extensive experiments show that the proposed method can effectively select a good feature subset and outperforms some state-of-the-art approaches.
Highlights: The conditional redundancy between the candidate feature and the selected features is considered. The dependency between the candidate feature and all class labels is involved. A metric called max-dependency and min-redundancy is used to evaluate each feature. Extensive experimental results show that the proposed method is effective.
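
A sketch of greedy max-dependency / min-redundancy selection with mutual information over all labels, assuming scikit-learn and discretized synthetic features; the scoring follows the general idea described above rather than the paper's exact formulation.

    # Sketch: greedy multi-label feature selection. Dependency = summed mutual
    # information between a candidate feature and every label; redundancy = average
    # mutual information with already-selected features.
    import numpy as np
    from sklearn.metrics import mutual_info_score

    rng = np.random.default_rng(0)
    n, n_feat, n_lab = 500, 15, 4
    X = rng.integers(0, 4, size=(n, n_feat))                 # already-discrete features
    Y = np.column_stack([(X[:, j] + rng.integers(0, 2, n)) % 2 for j in range(n_lab)])

    def dependency(f):
        return sum(mutual_info_score(X[:, f], Y[:, l]) for l in range(n_lab))

    def redundancy(f, selected):
        if not selected:
            return 0.0
        return np.mean([mutual_info_score(X[:, f], X[:, s]) for s in selected])

    selected, k = [], 5
    while len(selected) < k:
        remaining = [f for f in range(n_feat) if f not in selected]
        scores = {f: dependency(f) - redundancy(f, selected) for f in remaining}
        selected.append(max(scores, key=scores.get))

    print("selected feature indices:", selected)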

Journal ArticleDOI
TL;DR: This work proposes an outlier-robust ELM where the l1-norm loss function is used to enhance the robustness, and the fast and accurate augmented Lagrangian multiplier method is applied to guarantee the effectiveness and efficiency.
Abstract: Extreme learning machine (ELM), as one of the most useful techniques in machine learning, has attracted extensive attention due to its unique ability for extremely fast learning. In particular, it is widely recognized that ELM has a speed advantage while delivering satisfactory results. However, the presence of outliers may give rise to an unreliable ELM model. In this paper, our study addresses the outlier robustness of ELM in regression problems. Based on the sparsity characteristic of outliers, this work proposes an outlier-robust ELM where the l1-norm loss function is used to enhance the robustness. Specifically, the fast and accurate augmented Lagrangian multiplier method is applied to guarantee the effectiveness and efficiency. According to the experiments on function approximation and some real-world applications, the proposed approach not only maintains the advantages of the original ELM, but also shows notable and stable accuracy in handling data with outliers.
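
A sketch of an outlier-robust ELM for regression with an l1-norm residual loss solved by an ADMM-style augmented Lagrangian scheme; this follows the idea described above but is a generic solver with arbitrary parameter choices, not the paper's algorithm.

    # Sketch: random-feature ELM with l1 residual loss solved by an augmented
    # Lagrangian / ADMM scheme (synthetic data with injected gross outliers).
    import numpy as np

    rng = np.random.default_rng(0)
    x = np.linspace(-3, 3, 200).reshape(-1, 1)
    y = np.sinc(x).ravel() + 0.05 * rng.standard_normal(200)
    out = rng.choice(200, size=15, replace=False)
    y[out] += rng.choice([-2.0, 2.0], size=15)              # inject gross outliers

    # Random hidden layer of a standard ELM.
    n_hidden = 60
    W = rng.standard_normal((1, n_hidden))
    b = rng.standard_normal(n_hidden)
    H = np.tanh(x @ W + b)                                   # hidden-layer output matrix

    def soft_threshold(v, t):
        return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

    # Solve  min_beta  ||y - H beta||_1 + (C/2) ||beta||^2  via ADMM on e = y - H beta.
    C, rho = 1e-2, 1.0
    beta = np.zeros(n_hidden)
    e = np.zeros_like(y)
    mu = np.zeros_like(y)                                    # Lagrange multiplier estimates
    A = C * np.eye(n_hidden) + rho * H.T @ H
    for _ in range(200):
        beta = np.linalg.solve(A, rho * H.T @ (y - e + mu / rho))
        e = soft_threshold(y - H @ beta + mu / rho, 1.0 / rho)
        mu = mu + rho * (y - H @ beta - e)

    residual = y - H @ beta
    print("median |residual|:", round(float(np.median(np.abs(residual))), 4))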

Journal ArticleDOI
TL;DR: This study shows that taking into account such local characteristics of the minority class distribution can be useful both for analyzing performance of ensembles with respect to data difficulty factors and for proposing new generalizations of bagging.
Abstract: Various approaches to extending bagging ensembles for class-imbalanced data are considered. First, we review known extensions and compare them in a comprehensive experimental study. The results show that integrating bagging with under-sampling is more powerful than over-sampling. They also allow us to distinguish Roughly Balanced Bagging as the most accurate extension. Then, we point out that the complex and difficult distribution of the minority class can be handled by analyzing the content of a neighbourhood of examples. In our study we show that taking into account such local characteristics of the minority class distribution can be useful both for analyzing the performance of ensembles with respect to data difficulty factors and for proposing new generalizations of bagging. We demonstrate this by proposing Neighbourhood Balanced Bagging, where the sampling probabilities of examples are modified according to the class distribution in their neighbourhood. Two of its versions are considered: the first one keeps a larger size of bootstrap samples by hybrid over-sampling, and the other reduces this size with stronger under-sampling. Experiments prove that the first version is significantly better than existing over-sampling bagging extensions, while the other version is competitive with Roughly Balanced Bagging. Finally, we demonstrate that detecting types of minority examples depending on their neighbourhood may help explain why some ensembles work better for imbalanced data than others.
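
A sketch of neighbourhood-weighted bagging for imbalanced data: bootstrap sampling probabilities of minority examples grow with the number of majority-class neighbours around them. The weighting formula here is a plausible stand-in, not necessarily the one used in the paper; scikit-learn decision trees are the base classifiers.

    # Sketch: neighbourhood-weighted bagging. Minority examples surrounded by more
    # majority neighbours get higher bootstrap sampling probability.
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.neighbors import NearestNeighbors
    from sklearn.tree import DecisionTreeClassifier

    rng = np.random.default_rng(0)
    X, y = make_classification(n_samples=1000, weights=[0.9, 0.1], n_features=10, random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, test_size=0.3, random_state=0)

    k = 5
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X_tr)
    neigh = nn.kneighbors(X_tr, return_distance=False)[:, 1:]     # drop the point itself
    maj_neighbours = (y_tr[neigh] == 0).sum(axis=1)

    weights = np.ones(len(y_tr), dtype=float)
    weights[y_tr == 1] = 1.0 + maj_neighbours[y_tr == 1]          # boost hard minority cases
    probs = weights / weights.sum()

    ensemble = []
    for _ in range(30):                                           # 30 bootstrap replicates
        idx = rng.choice(len(y_tr), size=len(y_tr), replace=True, p=probs)
        ensemble.append(DecisionTreeClassifier(random_state=0).fit(X_tr[idx], y_tr[idx]))

    votes = np.mean([t.predict(X_te) for t in ensemble], axis=0)
    pred = (votes >= 0.5).astype(int)
    print("minority-class recall:", round(float(np.mean(pred[y_te == 1] == 1)), 3))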

Journal ArticleDOI
TL;DR: The combination of model-based identification of the robot geometric errors using EKF and a compensation technique using the ANN could be an effective solution for the correction of all robot error sources.
Abstract: Robot position accuracy plays an important role in advanced industrial applications. In this paper, a new calibration method for enhancing robot position accuracy is proposed. In order to improve robot accuracy, the method first models and identifies its geometric parameters using an extended Kalman filtering (EKF) algorithm. Because the non-geometric error sources (such as the link deflection errors, joint compliance errors, gear backlash, and so on) are either difficult or impossible to model correctly and completely, an artificial neural network (ANN) will be applied to compensate for these un-modeled errors. The combination of model-based identification of the robot geometric errors using EKF and a compensation technique using the ANN could be an effective solution for the correction of all robot error sources. In order to demonstrate the effectiveness and correctness of the proposed method, simulated and experimental studies are carried out on serial PUMA and HH800 manipulators, respectively. The enhanced position accuracy of the robots after calibration confirms the practical effectiveness and correctness of the method.

Journal ArticleDOI
TL;DR: A general learning framework, termed multiple kernel extreme learning machines (MK-ELM), to address the lack of a general framework for ELM to integrate multiple heterogeneous data sources for classification and can achieve comparable or even better classification performance than state-of-the-art MKL algorithms, while incurring much less computational cost.
Abstract: Extreme learning machine (ELM) has been an important research topic over the last decade due to its high efficiency, easy implementation, unification of classification and regression, and unification of binary and multi-class learning tasks. Despite integrating these advantages, existing ELM algorithms pay little attention to optimizing the choice of kernels, which is indeed crucial to the performance of ELM in applications. More importantly, there is a lack of a general framework for ELM to integrate multiple heterogeneous data sources for classification. In this paper, we propose a general learning framework, termed multiple kernel extreme learning machines (MK-ELM), to address the above two issues. In the proposed MK-ELM, the optimal kernel combination weights and the structural parameters of ELM are jointly optimized. Following recent research on support vector machine (SVM) based MKL algorithms, we first design a sparse MK-ELM algorithm by imposing an l1-norm constraint on the kernel combination weights, and then extend it to a non-sparse scenario by substituting the l1-norm constraint with an lp-norm (p > 1) constraint. After that, a radius-incorporated MK-ELM algorithm, which incorporates the radius of the minimum enclosing ball (MEB), is introduced. Three efficient optimization algorithms are proposed to solve the corresponding kernel learning problems. Comprehensive experiments have been conducted on the Protein, Oxford Flower17, Caltech101 and Alzheimer's disease data sets to evaluate the performance of the proposed algorithms in terms of classification accuracy and computational efficiency. As the experimental results indicate, our proposed algorithms can achieve comparable or even better classification performance than state-of-the-art MKL algorithms, while incurring much less computational cost.

Journal ArticleDOI
TL;DR: It is shown empirically that the advantage of using the method proposed in this paper is even clearer when noise features are added, and the proposed method has been compared with other baselines and three state-of-the-art MKL methods showing that the approach is often superior.
Abstract: The goal of Multiple Kernel Learning (MKL) is to combine kernels derived from multiple sources in a data-driven way with the aim to enhance the accuracy of a target kernel machine. State-of-the-art methods of MKL have the drawback that the time required to solve the associated optimization problem grows (typically more than linearly) with the number of kernels to combine. Moreover, it has been empirically observed that even sophisticated methods often do not significantly outperform the simple average of kernels. In this paper, we propose a time and space efficient MKL algorithm that can easily cope with hundreds of thousands of kernels and more. The proposed method has been compared with other baselines (random, average, etc.) and three state-of-the-art MKL methods showing that our approach is often superior. We show empirically that the advantage of using the method proposed in this paper is even clearer when noise features are added. Finally, we have analyzed how our algorithm changes its performance with respect to the number of examples in the training set and the number of kernels combined.

Journal ArticleDOI
TL;DR: This paper investigates the problem of stochastic finite-time state estimation for a class of uncertain discrete-time Markovian jump neural networks with time-varying delays with sufficient conditions for the error dynamics to be stochastically finite- time stable.
Abstract: This paper investigates the problem of stochastic finite-time state estimation for a class of uncertain discrete-time Markovian jump neural networks with time-varying delays. A state estimator is designed to estimate the network states through available output measurements such that the resulting error dynamics is stochastically finite-time stable. By means of a stochastic Lyapunov–Krasovskii functional approach, sufficient conditions are derived for the error dynamics to be stochastically finite-time stable. The desired state estimator is designed via the linear matrix inequality technique. Simulation examples are provided to illustrate the effectiveness of the obtained results.

Journal ArticleDOI
TL;DR: A classification approach that hybridizes statistical techniques and SOM for network anomaly detection and Probabilistic Self-Organizing Maps (PSOM) aim to model the feature space and enable distinguishing between normal and anomalous connections.
Abstract: The growth of the Internet and, consequently, the number of interconnected computers, has exposed significant amounts of information to intruders and attackers. Firewalls aim to detect violations according to a predefined rule-set and usually block potentially dangerous incoming traffic. However, with the evolution of attack techniques, it is more difficult to distinguish anomalies from normal traffic. Different detection approaches have been proposed, including the use of machine learning techniques based on neural models such as Self-Organizing Maps (SOMs). In this paper, we present a classification approach that hybridizes statistical techniques and SOM for network anomaly detection. Thus, while Principal Component Analysis (PCA) and Fisher Discriminant Ratio (FDR) have been considered for feature selection and noise removal, Probabilistic Self-Organizing Maps (PSOM) aim to model the feature space and enable distinguishing between normal and anomalous connections. The detection capabilities of the proposed system can be modified without retraining the map, but only by modifying the units activation probabilities. This deals with fast implementations of Intrusion Detection Systems (IDS) necessary to cope with current link bandwidths.

Journal ArticleDOI
TL;DR: The results show that the proposed ReliefF extensions improve preceding extensions and overcome some of their drawbacks, and confirm the effectiveness of the proposal for a better multi-label learning.
Abstract: Multi-label learning has become an important area of research due to the increasing number of modern applications that contain multi-label data. Multi-label data are structured in a more complex way than single-label data. Consequently, the development of techniques that improve the performance of machine learning algorithms over multi-label data is desired. Feature weighting and feature selection algorithms are important feature engineering techniques which have a beneficial impact on machine learning. The ReliefF algorithm is one of the most popular algorithms for feature estimation and has proved its usefulness in several domains. This paper presents three extensions of the ReliefF algorithm for working in the multi-label learning context, namely ReliefF-ML, PPT-ReliefF and RReliefF-ML. PPT-ReliefF uses a problem transformation method to convert the multi-label problem into a single-label problem. ReliefF-ML and RReliefF-ML adapt the classic ReliefF algorithm in order to handle multi-label data directly. The proposed ReliefF extensions are evaluated and compared with previous ReliefF extensions on 34 multi-label datasets. The results show that the proposed ReliefF extensions improve on preceding extensions and overcome some of their drawbacks. The experimental results are validated using several nonparametric statistical tests and confirm the effectiveness of the proposal for better multi-label learning.

Journal ArticleDOI
TL;DR: An ant colony algorithm for synchronous feature selection and parameter optimization for support vector machine in intelligent fault diagnosis of rotating machinery is presented and the advantages of the proposed method are evaluated.
Abstract: The failure of rotating machinery can result in fatal damage and economic loss, since rotating machinery plays an important role in the modern manufacturing industry. The development of a reliable and efficient intelligent fault diagnosis approach is an ongoing endeavor. The support vector machine (SVM) is a widely used machine learning method in intelligent fault diagnosis. However, finding good features that can discriminate different fault conditions and optimizing the parameters of the support vector machine can be regarded as the two most important problems that highly affect the final diagnosis accuracy of the support vector machine. Until now, the two issues of feature selection and parameter optimization have usually been treated separately, weakening the effects of both efforts. Therefore, an ant colony algorithm for synchronous feature selection and parameter optimization of the support vector machine in intelligent fault diagnosis of rotating machinery is presented. Compared with other methods, the advantages of the proposed method are evaluated on a rotor system experiment and an engineering application of locomotive roller bearings, which proves that it can attain much better results.